


MIT/LCS/TR-237 

TOWARDS A THEORY 

FOR 

ABSTRACT DATA TYPES 

Deepak Kapur 









Copyright Massachusetts Institute of Technology 1980 

May 1980 



This research was supported in part by the Advanced Research Projects Agency of the 
Department of Defense, monitored by the Office of Naval Research under contract 
N00014-75-C-0661, and in part by the National Science Foundation under grant 
MCS74-21892 A01.



Massachusetts Institute of Technology 
Laboratory for Computer Science 

Cambridge Massachusetts 02139 






Abstract 



A rigorous framework for studying immutable data types having nondeterministic
operations and operations exhibiting exceptional behavior is developed. The framework
embodies the view of a data type taken in programming languages and supports
hierarchical and modular structure among data types.

The central notion in this framework is the definition of a data type. An algebraic and
behavioral approach for defining a data type is developed which focuses on the
input-output behavior of a data type as observed through its operations. The definition of
a data type abstracts from the representational structure of its values as well as from the
multiple representations of the values for any representational structure.

A hierarchical specification language for data types is proposed. The semantics of a
specification is a set of related data types whose operations have the behavior captured by
the specification. A clear distinction is made between a data type and its specification(s).
The normal behavior and the exceptional behavior of the operations are specified
separately. The specification language provides mechanisms to specify (i) a precondition
for an operation thus stating its intended inputs, (ii) the exceptions which must be signalled
by the operations, and (iii) the exceptions which the operations can optionally signal. Two
properties of a specification, consistency and behavioral completeness, are defined. A
consistent specification is guaranteed to specify at least one data type. A behaviorally
complete specification 'completely' specifies the observable behavior of the operations on
their intended inputs.

A deductive system based on first order multi-sorted predicate calculus with identity is
developed for abstract data types. It embodies the general properties of data types, which
are not explicitly stated in a specification. The theory of a data type, which consists of a
subset of the first order properties of the data type, is constructed from its specification.
The theory is used in verifying programs and designs expressed using the data type. Two
properties of a specification, well definedness and completeness, are defined based on what
can be proved from it using different fragments of the deductive system. The sufficient
completeness property of Guttag and Horning is also formalized and related to the
behavioral completeness property. The well definedness property is stronger than the
consistency property, because the well definedness property not only requires that the
specification specifies at least one data type, but also captures the intuition that it preserves
other specifications used in it thus ensuring modular structure among specifications. The
completeness property is stronger than the sufficient completeness property, since in
addition to the requirement that the behavior of the observers can be deduced on any
intended input by equational reasoning, it also requires that the equivalence of the
observable effect of the constructors can be deduced from the specification by equational
reasoning.

A correctness criterion is proposed for an implementation coded in a programming 
language with respect to a specification. It is defined as a relation between the semantics of 
an implementation and the semantics of a specification. It does not require a correct 
implementation to have the maximum amount of nondeterminism specified by a
specification. A methodology for proving correctness of an implementation is developed 
which embodies the correctness criterion. 



Name and Title of Thesis Supervisor: Barbara H. Liskov

Associate Professor of Electrical Engineering 
and Computer Science 

Key Words and Phrases: Abstract Data Type, Data Type, Data Abstraction, Type Algebras,
Nondeterminism, Exceptions, Specification Language, Semantics,
Consistency, Behavioral Completeness, Deductive System,
Verification, Proof Technique, Sufficient Completeness,
Completeness, Well Definedness, Implementation Correctness



This report is a minor revision of a thesis of the same title submitted to the Department of
Electrical Engineering and Computer Science in March, 1980 in partial fulfillment of the
requirements for the degree of Doctor of Philosophy.



Acknowledgments 

I am thankful to my thesis supervisor, Professor Barbara Liskov, for her patience and
encouragement during the thesis research and especially during the later stages; to 
Professor John Guttag for posing many challenges and for many suggestions leading to 
improvements in the presentation of the thesis; to Professor Carl Hewitt for helping me 
organize and present my ideas in the early stage of the research; and to Professor Hal 
Abelson for diligently reading the final draft and making many helpful comments. 

My officemates, Valdis Berzins, Srivas Mandayam, and Carl Seaquist have helped me in
many ways during the thesis research. They gave me an audience whenever I needed,
helped me organize my ideas, and found time to read my work whenever I asked them
irrespective of their other important responsibilities. Carl and Srivas provided a very
stimulating and encouraging atmosphere during the last year. I am also thankful to Russ
Atkinson, Moms Krishnamurthy, Dave Musser, Gene Stark, and Jeannette Wing for their
helpful comments. Eliot Moss is to be thanked for producing and maintaining the software
necessary for the production of this document.

The graduate study at MIT has provided me a unique opportunity to live outside of my 
own country which has been a tremendous learning experience. Besides computer science, 
I have learnt a great deal about life, this country, my country, and myself, which has 
fundamentally changed my attitude and outlook towards life. For this, I am indebted to
the students and staff of the Seminar on International Students and Their Participation in 
Development, and my friends, especially Arvind, Ashok, Carl, Kanchan, Krishna, 
Mukundan, Nagu, Ravi, Rashid, Sekhar, Srivas, Vaqar, and Vinod. Without their 
encouragement and interest, continuing the thesis research would not have been possible. 
Roli has contributed to the completion of the thesis in her own unique way; in no way can I 
adequately express my gratitude to her. 










Table of Contents

1. Introduction
    1. Scope and Approach of the Thesis
        1. Scope and Assumptions
        2. Definition of a Data Type
        3. Specification Method
        4. Deductive System
        5. Correctness of Implementation
    2. Related Work
    3. Outline of the Thesis

2. Definition of an Abstract Data Type
    1. Informal Description of a Data Type
        1. Terminology
        2. Hierarchical Structure
        3. Minimality Property
    2. Formalism
        1. Type Algebras
        2. Examples of Type Algebras
        3. Interpretation of Terms
        4. Observable Behavior
            1. Definitions of Observable Equivalence and Distinguishability
            2. Reduced Algebras
        5. Behavioral Equivalence of [illegible]
        6. Definition of a Data Type
        7. Observable Equivalence and Distinguishability of Terms
    3. Exceptional Behavior of a Data Type
        1. Assumptions about Exception Handling Mechanism
        2. Formalism
            1. Terms, Exception Terms, and Interpretations
            2. Examples of Modified Type Algebras
            3. Observable Behavior and Distinguishability
            4. Comparison with Goguen's Approach
        3. A Simpler Approach
    4. Mutually Recursive Data Types

3. Specification of an Abstract Data Type
    1. Specification Language
        1. Operations
        2. Auxiliary Functions
        3. Restrictions
            1. Preconditions
            2. Exception Conditions
            3. Discussion
        4. Axioms
        5. Specifying Nondeterministic Operations
        6. Specification of Mutually Recursive Data Types
    2. Semantics of Specification Language
        1. Specifications without Auxiliary Functions
            1. Restrictions
            2. Axioms
        2. Specifications with Auxiliary Functions
        3. Semantics of a Specification
    3. Specification of a Data Type and Equivalence of Specifications
    4. Specification of Bool
    5. Properties of a Specification
        1. Consistency
        2. Behavioral Completeness
            1. Partial Isomorphic Equivalence
            2. Isomorphic Embeddability
            3. Partial Isomorphic Embeddability
            4. Definition of Behavioral Completeness
    6. Comparison With Related Works

4. Deductive System
    1. Preliminaries
    2. Theory of Data Types without Nondeterminism and without Exceptional Behavior
        1. Derivation of Nonlogical Axioms
        2. Equational Subtheory
        3. Distinguishability Subtheory
        4. Inductive Subtheory
            1. Infinite Induction Rule
            2. Rationale for an Infinite Induction Rule
            3. Use of the Induction Rule
            4. Specifications with Nontrivial Preconditions for Constructors
        5. The Full Theory
        6. Properties of a Specification
            1. Sufficient Completeness
            2. Completeness
            3. Well Definedness
        7. Automation of IND(S)
    3. Theory of Exceptions Without Nondeterminism
        1. Derivation of Nonlogical Axioms
            1. Restrictions Component
            2. Axioms Component
            3. Definition of [illegible]
        2. Equational Subtheory
        3. Distinguishability Subtheory
        4. Inductive Subtheory
        5. The Full Theory
        6. Properties of a Specification
            1. Sufficient Completeness
            2. Completeness and Well Definedness
    4. Theory of Nondeterminism
        1. Transformation Procedure TR
        2. Th(S)
        3. Data Types With Exceptional Behavior
        4. Properties of a Specification
        5. Strong Equivalence of Specifications

5. Correctness of Implementation
    1. Correctness Criterion and Overview of Correctness Method
        1. Semantics of an Implementation
        2. Correctness Method
            1. Nondeterminism
            2. Definition of Correctness
    2. Implementation Structure and Semantics
        1. Procedures - Approach I
        2. Procedures - Approach II
        3. Properties of the Encapsulation Mechanism
        4. Semantics of an Implementation
    3. Correctness Method
        1. Auxiliary Functions in a Specification
        2. Preservation of Inv
        3. Termination of Procedures
        4. Proving Restrictions and Axioms
            1. Preservation of Equivalence Relation
            2. Restrictions
            3. Axioms
        5. Nondeterministic Procedures
        6. Pseudo-Nondeterministic Procedures
    4. Recursive and Mutually Recursive Implementations
        1. Recursive Implementations
        2. Mutually Recursive Implementations

6. Conclusions
    1. Summary of Contributions
    2. Directions for Further Research

References

Appendix I. Elaboration of Scope and Assumptions
    1. Immutable and Mutable Data Types
    2. Exceptional Behavior
    3. Nondeterminism

Appendix II. Definitions of Algebraic Concepts and Proofs of Theorems in Chapter 2
    1. Congruence, Homomorphism, and Isomorphism
    2. Proof of Theorem 2.2
    3. Elaboration of the Definition of Behavioral Equivalence
       and Proofs of Theorems 2.5 and 2.6

Appendix III. Proofs of Theorems in Chapter 4

Appendix IV. Specifications of Data used in [illegible]






1. Introduction 

The role of abstraction, modularity and hierarchical structure has been well
recognized in the literature on program design and construction [12, 66, 73]. Data
abstraction, in particular, has been found to be a useful abstraction mechanism in the
design and construction of well structured programs [51].¹ Most of the recent
programming languages encourage the use of abstract data types by providing an
encapsulation mechanism for implementing them [65, 49, 52, 75, 45, 1]. It is necessary to
develop a rigorous foundation for abstract data types so that the informal concept of an
abstract data type can be placed on a firm and sound basis, and various aspects of this
concept can be studied and analyzed.²

In this thesis, we develop a framework for abstract data types. The central notion 
in this framework is the definition of an abstract data type. We develop a behavioral 
method for defining a class of abstract data types, called immutable data types [49, 52]. An 
immutable data type is defined as a set of behaviorally equivalent algebras having 
interpretations for the values and the operations of the data type. Behaviorally equivalent 
algebras have the same behavior as observed through their operations. We propose a 
specification language for abstract data types. The semantics of a specification is a set of 
related data types sharing the common behavior captured by the specification. We make a 
clear distinction between a data type and its specification(s). We develop a deductive 
system for abstract data types embodying their general properties which are not explicitly 
stated in a specification. We use the deductive system to prove properties of an abstract 
data type from its specification. We propose a correctness criterion for an implementation 
of an abstract data type with respect to its specification, and develop a methodology for 
proving correctness of an implementation with respect to a specification which embodies 
the proposed criterion. 



1. The terms abstract data type, data type, data abstraction, and type are used synonymously in this thesis. 

2. Liskov and Zilles [47] emphasize the need for rigorously developing the mathematical foundation of the 
specification methods for abstract data types. 






The main contribution of this research is a framework for abstract data types that
is rigorous and that brings together various aspects of abstract data types in a unified and
coherent way. Our approach is better than other similar attempts, in particular the initial
algebra formalism of the ADJ group [23] and the category theory formalism of Goguen
[20, 7, 30], because it is more in tune with the way programming languages support the
mechanism of abstract data type. The framework incorporates important and useful
features such as hierarchical structure and modularity. It is also broader in scope as it
handles data types with nondeterministic operations and with operations exhibiting
exceptional behavior. We had originally developed the framework without considering
nondeterminism and exceptional behavior; however, we did not encounter any major
difficulties in extending it to incorporate nondeterminism and exceptional behavior. This
makes us believe that our framework is robust and extensible for studying other aspects of
data type behavior not discussed in this thesis.

Our framework will be useful to a designer of a specification language for abstract
data types as it provides a semantic basis for studying and comparing such specification
languages. It can be used to define the semantics of a specification language. It also
provides a formal basis for automatic deductive systems for abstract data types, such as
AFFIRM [60]. It suggests an approach for studying and extending the method of
reasoning about data types developed in the thesis. Other methods of reasoning can also be
developed using it. Furthermore, this research clarifies our intuitions about data type
behavior and provides a formal basis for them; as examples, the notions of consistency and
sufficient completeness advocated by Guttag and Horning [28], and the correctness
criterion for an implementation [29, 40] can be stated formally and analyzed.

Our research has been highly influenced by Peano's method of defining natural
numbers and McCarthy's method of defining S-expressions [57]. We are intellectually
indebted to Zilles [77] and the ADJ group [23], for their work on the algebraic approach for
abstract data types, and to Guttag et al. [25, 28, 29] for their work on specification
technique for abstract data types which emphasizes programmers' intuitions about data
types. We cite other related works in Section 1.2, and state how we plan to compare these
works with that discussed in the thesis.






1.1 Scope and Approach of the Thesis



We first state the scope of the thesis and the assumptions made about the data 
type behavior. The scope and assumptions are further discussed in Appendix I. Later, we 
give an overview of the approach taken in studying four issues, namely, definition, 
specification, deductive system, and implementation correctness. 

1.1.1 Scope and Assumptions 

In our research, we have considered immutable data types having
nondeterministic operations and operations exhibiting exceptional behavior. Every
operation is assumed to be total and computable (see [42] for a precise characterization of
computability on the values of a data type); it terminates on every input in its domain either
normally by returning a value of its range type or by signalling an exception. A
nondeterministic operation has only finitely many choices on an input. If a
nondeterministic operation signals on an input, it is assumed to behave deterministically on
that input. So, it does not have a choice between signalling and terminating normally on a
particular input. Henceforth, by a data type, we mean an immutable data type with the
above behavior, and by an object we mean an immutable object or a value.
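
The assumptions above can be illustrated by a small sketch in Python, which is not the notation used in this thesis. It models the outcome of applying an operation to an input as either a normal termination with a finite, nonempty set of possible results, or a single signalled exception. The names Normal, Signal, is_wellformed, and the exception name "empty" are hypothetical and are introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Normal:
    """Normal termination: the finite, nonempty set of possible results."""
    results: FrozenSet[int]


@dataclass(frozen=True)
class Signal:
    """Exceptional termination: a named exception with its arguments."""
    name: str
    args: Tuple = ()


Outcome = Union[Normal, Signal]


def choose(s: FrozenSet[int]) -> Outcome:
    """Hypothetical Choose on finite sets of integers (illustration only)."""
    if not s:
        # Signalling is deterministic: there is no choice between
        # signalling and terminating normally on the same input.
        return Signal("empty")
    # Nondeterministic, but with only finitely many choices on this input.
    return Normal(frozenset(s))


def is_wellformed(outcome: Outcome) -> bool:
    """Check the assumed shape: one signal, or at least one possible result."""
    if isinstance(outcome, Signal):
        return True
    return len(outcome.results) >= 1
```

Because every input yields exactly one Outcome, the sketch is total in the sense assumed above; an operation that could loop forever or return no result at all falls outside the scope stated here.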

1.1.2 Definition of a Data Type

Our formalism for defining a data type is algebraic in the style of Zilles [77] and
the ADJ group [23]. Algebras are a natural and elegant way to define an immutable data
type, because an immutable data type is informally a set of values and a set of operations.
In a programming language supporting data types, the most important aspect of a data type
to its designer as well as its user is the input output behavior of its operations [37, 47, 25].
The values of a data type are manipulated only by its operations. Outside its
implementation module(s), the values are viewed abstractly as sequences of operations.
The details about the representations of values and the operations of a data type are of no
relevance.³ To a user, two distinct representations are identical if they cannot
be distinguished by the operations of the data type. We call this view the behavioral view
of a data type. The behavioral view abstracts from the representational structure of the
values as well as from the multiple representations of a value for any representational
structure. It is a further abstraction on the view of a data type adopted by ADJ [23] and
Zilles [77] which abstracts only from the representational structure of the values.

In a programming language supporting modularity and hierarchical structure
such as CLU, EUCLID, etc., data types are implemented hierarchically one at a time
except that mutually recursive data types are implemented together as a group; data types
other than those being implemented are assumed to be implemented elsewhere.⁴ We take
the same approach in defining a data type. Our definitional method is hierarchical. We
distinguish between the data type(s) being defined and other data types used in the
definition. We call the data type(s) being defined the defined type(s) and other data types
in the definition the defining types. The distinction between the defined type and defining
types is significant because the behavior of the values of the defined type is observed by the
operations which return the values of the defining types. This was first pointed out by
Guttag [25], and is the basis of his definition of the sufficient completeness property. We
use the data type boolean, which is self-contained and does not have any defining types, as
the basis of our definitional method. We assume its definition and that all boolean values
are distinguishable. In fact, any data type whose values can be distinguished a priori
(outside the formalism) can be used as the basis. For example, any data type directly
supported in a programming language whose values are distinguishable using the literal
(constant naming) mechanism in the programming language is a suitable candidate.

3. We will not be concerned about other issues, such as efficiency of the operations, etc., relevant to a user of
a data type. Our formalism is limited in this sense.

4. Mutually recursive data types are different from mutually recursive implementations; see Chapter 5 for a
detailed discussion.



We classify the operations of a data type into two categories - the constructors,
which construct the values of the data type, and the observers, which return the values of
the defining types. A value of a data type manifests its behavior through the observers with
the help of constructors.
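
The behavioral view and the constructor/observer classification can be sketched in Python for a finite set of integers. The sketch below is illustrative only: the representation (a list that keeps duplicates and insertion order) and the names empty, insert, has, size, and distinguishable are assumptions made for this example. The check covers only direct observations on a finite probe set, whereas the formal notion also considers observers applied after further constructor applications.

```python
from typing import Iterable, List


# Constructors: they build values of the defined type.
def empty() -> List[int]:
    return []


def insert(s: List[int], x: int) -> List[int]:
    # The representation keeps duplicates and insertion order.
    return s + [x]


# Observers: they return values of the defining types (bool and int).
def has(s: List[int], x: int) -> bool:
    return x in s


def size(s: List[int]) -> int:
    return len(set(s))


def distinguishable(a: List[int], b: List[int], probes: Iterable[int]) -> bool:
    """True if some observer separates a and b on the given probe integers."""
    if size(a) != size(b):
        return True
    return any(has(a, x) != has(b, x) for x in probes)


a = insert(insert(insert(empty(), 1), 2), 1)   # represented as [1, 2, 1]
b = insert(insert(empty(), 2), 1)              # represented as [2, 1]
c = insert(empty(), 1)                         # represented as [1]
```

Here a and b are distinct representations, yet no observer in the sketch tells them apart, so under the behavioral view they stand for the same value; c is separated from both by size.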

Our approach for modeling the exceptional behavior embodies a practical view of
exceptions. Each exception is named, and can have arguments that carry information to its
handler from the place where it is signalled. The exceptional behavior of the operations
can also be used to distinguish among different values. An operation can distinguish
between two values by signalling on one value and terminating normally on the other
value, or by signalling different exceptions on different values.
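
A minimal sketch of this view of exceptions, in Python and under assumptions of our own choosing: a stack is represented as a tuple, and the observer second, the class Signal, and the exception names "empty" and "too_short" are hypothetical. The function observe records how an operation terminates, so that two values can be compared by their recorded outcomes.

```python
class Signal(Exception):
    """A named exception carrying arguments to its handler (hypothetical model)."""

    def __init__(self, name, *args):
        super().__init__(name, *args)
        self.name = name
        self.sig_args = args


def second(stack):
    """Hypothetical observer: the element just below the top of a stack (a tuple)."""
    if len(stack) == 0:
        raise Signal("empty")
    if len(stack) == 1:
        # The argument carries information from the signalling site to the handler.
        raise Signal("too_short", len(stack))
    return stack[-2]


def observe(op, value):
    """Record how op terminates on value: normally, or by signalling."""
    try:
        return ("normal", op(value))
    except Signal as e:
        return ("signal", e.name, e.sig_args)
```

In this sketch the empty stack and a one-element stack are distinguished because second signals different exceptions on them, and both are distinguished from a two-element stack, on which second terminates normally.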

The model used for nondeterminism is simple. If a nondeterministic operation
behaves nondeterministically on an input (i.e., it has a choice to return one of the many
possible results), we expect it to return every possible result. We do not consider how these
results are scheduled by an implementation of the operation. Two operations having
different amounts of nondeterminism are considered to have different observable behavior
because for some input, they can always return distinguishable results. Data types with
operations having different amounts of nondeterminism are thus considered different. For
example, consider a data type finite set of integers with a nondeterministic operation
Choose which nondeterministically picks an arbitrary element from a nonempty finite set
of integers given as an argument. This data type is different from another similar data type
with the same set of operations which also have the same behavior with the exception of
Choose which is deterministic and returns the maximum integer of a nonempty set.
Furthermore, both data types are different from yet a third data type with the same set of
operations as the other two types except that Choose has a limited amount of
nondeterminism: Choose nondeterministically picks between the maximum and minimum
integers from a nonempty set.
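
The three variants of Choose described above can be compared in a short Python sketch. Each function returns the set of all results the operation could return on a nonempty set; the function names and the helper same_behavior are assumptions made for illustration, and the empty set is deliberately left out since Choose is described only for nonempty sets.

```python
def choose_any(s):
    """Fully nondeterministic: any element of the nonempty set may be returned."""
    return frozenset(s)


def choose_max(s):
    """Deterministic: only the maximum can be returned."""
    return frozenset({max(s)})


def choose_extreme(s):
    """Limited nondeterminism: either the minimum or the maximum."""
    return frozenset({min(s), max(s)})


def same_behavior(f, g, inputs):
    """True if f and g have the same possible results on every given input."""
    return all(f(s) == g(s) for s in inputs)


small = frozenset({1, 2})
large = frozenset({1, 2, 3})
```

On the two-element set the first and third variants coincide, but on the three-element set all three differ, which matches the statement that operations with different amounts of nondeterminism are distinguishable on some input.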

1.1.3 Specification Method 

A specification is mainly used, among other things, for reasoning about a data
type. So, our specification method is axiomatic in the style of Standish [69], Hoare [38, 39],
Guttag [26, 29], Nakajima et al. [62], etc. A specification embodies information hiding [66],
i.e., it only specifies the behavior of a data type. Our specification method is hierarchical.
Data types are specified incrementally, one at a time; a specification uses the specifications
of other data types. We believe that specifications should be modular and well structured
just like programs; otherwise, specifications of large problems become unmanageable and
difficult to understand.⁵

A specification expresses the properties particular to the data type(s) being
specified. It specifies (i) the domain, range, and the exceptions with the types of their
arguments, if any, signalled by every operation, (ii) the normal behavior as well as the
exceptional behavior of the operations. The general properties of data types which hold for
every data type, for example, the minimality property which requires that every value of a
data type is constructed by finitely many applications of its constructors, are not included in
a specification.
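
The minimality property can be pictured with a small Python sketch for finite sets of integers, assuming two constructors, an empty set and an insert operation (both names are ours). The function below collects every value reachable by at most a given number of constructor applications over a finite universe of integers; minimality says that every value of the type appears in such a collection for some finite depth.

```python
def values_up_to(depth, universe):
    """All set values reachable by at most `depth` applications of insert to empty."""
    reached = {frozenset()}          # the value built by the constructor empty
    frontier = {frozenset()}
    for _ in range(depth):
        # Apply insert(s, x) to every value found in the previous round.
        frontier = {s | {x} for s in frontier for x in universe} - reached
        reached |= frontier
    return reached
```

For the universe {1, 2}, two rounds already reach all four subsets, and further rounds add nothing new.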

The normal behavior of the operations is specified as a restricted set of formulas
of first order multi-sorted predicate calculus with identity. A typical formula is a
conditional equation relating different sequences of operations under a condition. A
specification can use a finite set of auxiliary functions so that any data type with a finite set
of total deterministic computable operations can be specified in this way [45]. A
nondeterministic operation is specified like a deterministic operation by expressing the
properties of its possible results on an input rather than by explicitly specifying its relation
which holds for all possible results of the operation and the input and does not hold for any
other value and the input. For example, in case of the data type finite set of integers, the
nondeterministic operation Choose is specified by relating its possible results to its set
argument, instead of explicitly specifying its relation Choose_p : SetInt x Int --> Bool
which holds for a set and an integer if and only if Choose can return the integer when
applied on the set.
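
The difference between the two styles can be sketched in Python. We assume, purely for illustration, that the property relating the results of Choose to its argument is membership (every possible result is an element of the set); the actual axioms are given in Chapter 3. The names satisfies_property and choose_p are hypothetical.

```python
def choose_any(s):
    """Possible results when any element may be returned."""
    return frozenset(s)


def choose_max(s):
    """Possible results when only the maximum is returned."""
    return frozenset({max(s)})


def satisfies_property(possible_results, s):
    """Property style: Choose has a result, and every result is a member of s."""
    return len(possible_results) > 0 and all(x in s for x in possible_results)


def choose_p(impl, s, i):
    """Relation style: holds for s and i iff impl can return i when applied to s."""
    return i in impl(s)


s = frozenset({4, 8, 15})
```

Both implementations satisfy the assumed property, although their explicit relations differ: the property leaves the amount of nondeterminism open, whereas a relation such as Choose_p fixes it completely.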

5. Burstall and Goguen [7] and Nakajima et al. [62] also emphasize the need for structured specifications.



The exceptional behavior of the operations is specified as a separate layer on top
of the normal behavior. Following Guttag [31], if an operation signals an exception, we
specify the condition on its input under which the exception is signalled.⁶ The
specification language provides mechanisms to specify the exceptions which must be
signalled by the operations as well as the exceptions which the operations can optionally
signal. The specification also allows a precondition on an operation to be specified, stating
that the behavior of the operation on inputs not satisfying the precondition is not of any
interest. A formula expressing the normal behavior of the operations holds only if the
inputs to the operations in the formula satisfy the specified preconditions and if the
operations do not signal; it thus has a restricted interpretation. A formula specifying the
normal behavior is called an axiom. The preconditions and the exceptional behavior of the
operations are specified using restrictions.
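
One possible reading of how restrictions constrain an operation is sketched below in Python. It is an assumption-laden illustration, not the semantics defined in Chapter 3: the dictionary layout, the function outcome_allowed, the Insert operation, and the exception names "full" and "duplicate" are all hypothetical.

```python
def outcome_allowed(restrictions, args, outcome):
    """Decide whether an observed outcome is permitted by the restrictions.

    outcome is ("normal", value) or ("signal", name).
    """
    if not restrictions["pre"](*args):
        return True  # inputs outside the precondition are not of interest
    for name, cond in restrictions["must_signal"].items():
        if cond(*args):
            # A required exception leaves no other permitted outcome.
            return outcome == ("signal", name)
    if outcome[0] == "signal":
        # An optional exception is allowed only when its condition holds.
        cond = restrictions["may_signal"].get(outcome[1])
        return cond is not None and cond(*args)
    return True  # normal termination; the axioms would constrain the value


# Hypothetical restrictions for an Insert operation on finite sets of integers.
INSERT = {
    "pre": lambda s, x: x >= 0,
    "must_signal": {"full": lambda s, x: len(s) >= 3 and x not in s},
    "may_signal": {"duplicate": lambda s, x: x in s},
}

s = frozenset({1, 2, 3})
```

Under these assumed restrictions, inserting 9 into s must signal "full", inserting 2 may either signal "duplicate" or terminate normally, and inserting -1 is unconstrained because it violates the precondition.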

Our approach of specifying data types is thus different from those of Zilles [77]
and the ADJ group [23]. In their approaches, a specification of a data type is a finite set of
identities (or conditional identities) presenting the set of algebras serving as the definition
of a data type. These identities are interpreted exactly the same way as in Universal
Algebra [4, 10]. We are also not constrained to employ only "equational" reasoning;
instead, our reasoning method embodies the general properties of data types as is discussed
later.

6. However, this way of specifying the exceptional behavior of the operations may be overly restrictive, as for
an operation, the subset of inputs on which it signals a particular exception may be very complex to specify.



The semantics of a properly designed specification is a set of related data types
which differ in the behavior intentionally not captured by the specification. If an operation
is specified to be nondeterministic, the semantics of a specification includes data types in
which that operation can have as much nondeterminism as desired insofar as the operation
behavior satisfies the axioms and restrictions expressed in the specification. We define
equivalence among specifications. We also state when a data type can be (precisely)
specified in the proposed specification language. We define two important properties of a
specification: The consistency property, which states whether a specification specifies any
data type; the behavioral completeness property, which guarantees that the observable
behavior of the operations is not left unintentionally unspecified. These properties ensure
that various components of a specification have the desired structure. Checking for these
properties is a step towards ensuring that the specification captures the intuition of a
designer.

In our research, a clear distinction is made between a data type and its
specification. In most of the literature on specification techniques for data types
[47, 25, 28, 29, 61, 77, 48, 37], this distinction is either not made or blurred if it is implied.
Most of the literature does not explicitly define what a data type is. The ADJ group [23]
was the first to our knowledge to explicitly state in their formalism a definition of a data
type and make this distinction. We believe the distinction between a data type and its
specification is useful and necessary in a formal treatment of data types. Given a definition
of a data type, different specification techniques can be developed to serve different
purposes, if needed, and their semantics can be given in terms of data types. Different
methods of reasoning about a data type can be developed incorporating the general
properties of data types with the definition of a data type serving as their basis. The
question of whether a given data type can be specified using a particular specification
technique can arise only when this distinction is made; only then can different specification
techniques be compared in their expressive power. Only then is it meaningful to discuss
the properties of a specification technique such as the ease of expression,
comprehensibility, minimality, etc., [47]. (See [34] for a similar discussion for programs.)

A specification plays an important role in our research. It is used as a standard for
checking the correctness of an implementation as well as for deriving properties of the data
types specified, as is discussed in the next two subsections. It is an interface between the
programs using the data type and the programs implementing the data type. The
specifications of abstract data types are a major component of a program verification
system. Our specification method can be used to specify the behavior of the data
component of software designs; questions and inquiries about the data in a design can be
expressed and analyzed using the deductive system discussed in the next subsection. (See
the two survey papers on specification methods [47, 48], where the need for writing formal
specifications is discussed. Guttag and Horning [32] discuss the importance of formal
specifications as a design tool.)






1.1.4 Deductive System

As was stated earlier, one of the main reasons for designing a specification is to 
have an implementation independent description of the data type that can be used to 
reason about the data type as well as to reason about the designs and programs using the 
data type. We propose a deductive system based on first order multisorted predicate 
calculus with identity for deriving properties of a data type from its specification. The 
deductive system embodies the general properties of data types which are not explicitly 
stated in a specification but assumed in its semantics. These properties are derived from 
the syntactic structure of the operations. 

The deductive system has an infinite rule which captures the minimality property 
of data types. The deductive system is powerful enough to prove inequalities. We 
axiomatize the general properties of the exceptional behavior of the operations. Properties 
expressed using nondeterministic operations can be proved. We construct a theory of a 
data type, which is a large subset of its first order properties, from its specification. If a 
specification specifies a set of related data types, every theorem in the theory constructed 
from the specification holds for each data type in the set.

We define three other structural properties of a specification, namely, sufficient 
completeness, well definedness, and completeness, based on what properties of a data type
can be deduced from its specification using different fragments of the deductive system. 
We precisely state the sufficient completeness property defined by Guttag and Horning 
[28] for a restricted set of specifications and extend it to specifications in our specification 
language. This property requires that the behavior of the observers on their intended 
inputs can be completely determined from the specification by purely equational 
reasoning. We relate this property to the behavioral completeness property stated in the 
previous subsection, which is model theoretic and which requires that the specification 
completely specify the behavior of the observers on intended inputs. Recall that the 
behavioral completeness property does not say anything about what can be deduced from 
the specification. In this sense, the relation between behavioral completeness and sufficient 
completeness reflects the power of the equational fragment of the deductive system. 

The well definedness property is stronger than the consistency property, because






the well definedness property not only requires that a specification specifies at least one 
data type, but also that the specification is modular in the sense that it preserves the
specifications of other data types used in it.

The completeness property is stronger than the sufficient completeness property, 
since in addition to the requirement that the behavior of the observers can be deduced on 
any intended input by equational reasoning, it also requires that the equivalence of the 
observable effect of the constructors on intended inputs can be deduced from the 
specification by equational reasoning. 

1.1.5 Correctness of Implementation 

We state the correctness criterion for an implementation coded in a programming
language with respect to a specification as a relation between the semantics of the
implementation and the semantics of the specification. Roughly speaking, a correct
implementation implements one of the data types in the semantics of a specification. Our
correctness criterion is weak, as it does not require a correct implementation to have the
maximum amount of nondeterminism specified by a specification.

We develop a method for proving correctness of an implementation with respect 
to a specification which embodies the correctness criterion. The method requires, among 
other things, that the procedures implementing the operations satisfy the axioms and 
restrictions in the specification when appropriately interpreted. We thus provide the
formal basis of the correctness method proposed by Guttag et al. [29] and extend it to
specifications specifying nondeterministic operations and operations exhibiting exceptional
behavior. 

We distinguish among different procedures implementing an operation specified
to be nondeterministic, since the nondeterministic behavior of an operation on abstract
values can be implemented by a deterministic procedure on the representation of these
abstract values that returns different results on different but equivalent representations.
We call a procedure nondeterministic (respectively, deterministic) if it is nondeterministic
(respectively, deterministic) and it returns equivalent results on equivalent representations.
Otherwise, if a procedure returns different results on equivalent representations, it is






called pseudo-nondeterministic irrespective of whether it is deterministic or 
nondeterministic on the representations. We discuss the correctness method for these three 
kinds of procedures implementing an operation specified to be nondeterministic. 
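
The three kinds of procedures can be illustrated with a minimal sketch. It assumes a hypothetical representation of a finite set of integers as a duplicate-free list, so that lists holding the same elements in a different order are equivalent representations. All function names are illustrative and are not taken from the thesis.

```python
import random

# Assumption: a finite set of integers is represented by a duplicate-free
# list; lists with the same elements in a different order are equivalent.

def equivalent(rep1, rep2):
    """Two representations are equivalent if they hold the same elements."""
    return set(rep1) == set(rep2)

def choose_deterministic(rep):
    """Deterministic: returns the same result on equivalent representations."""
    return min(rep)

def choose_nondeterministic(rep):
    """Nondeterministic on the representation; the set of possible results
    is the same for equivalent representations."""
    return random.choice(rep)

def choose_pseudo(rep):
    """Deterministic on each representation, but it can return different
    results on equivalent representations (pseudo-nondeterministic)."""
    return rep[0]

a, b = [1, 2, 3], [3, 2, 1]   # two equivalent representations
```

Here `choose_pseudo(a)` and `choose_pseudo(b)` differ although `a` and `b` represent the same abstract set, which is the situation the term pseudo-nondeterministic is meant to capture.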

1.2 Related Work 

In this section, we discuss different definitional and specification methods for data 
types, briefly stating the major differences as well as the main thrust of these works. The 
detailed comparison of these works with ours is contained in the rest of the thesis where we
discuss various topics.

The definitional methods for data types can be broadly classified as (i) the 
algebraic or model approach, and (ii) the axiomatic approach. In the model approach, a
data type is defined as an algebra satisfying certain properties, or as a set of such algebras.
ADJ [23] defines a data type in this way. Though Hoare [37], Zilles [77], Guttag [28], and
Berzins [3] do not explicitly define what a data type is, their approaches suggest that a data
type is defined using the model approach. Our approach is also the model approach.

Nakajima et al. [62] take the axiomatic approach; they define a data type as a first 
order multi-sorted theory. Recently Nourani [63] has also discussed the use of a first order 
theory for defining a data type. Though this view of a data type is useful in program 
verification, there is no explicit model of a data type to match with the intuition of a 
designer of the data type. If a first order theory is interpreted as in Logic [16] and its 
models are taken as the models of a data type being defined, then there are nonstandard
models for a data type, which are of no relevance to its designer. A nonstandard model 
does not satisfy the minimality property of data types discussed in the next chapter. Hoare 
[38, 39] has also used the axiomatic approach for defining a class of data types. 

A survey of specification techniques for data types can be found in [47] and [48]. 
The specification techniques can be broadly classified into three categories based on their 
approach: (i) the model approach, (ii) the algebraic approach, and (iii) the axiomatic
approach. The model approach is used only in case a data type is defined using the model
approach. A data type is specified by presenting one of its models. Berzins [3] has 
formalized and extended the model approach originally proposed by Hoare [37]. He has 






also related his research to other works following the model approach. We discuss here the 
algebraic and axiomatic approaches. 

The algebraic approach has been proposed by Zilles [77] and the ADJ group [23];
in this approach, a set of algebras defining a data type is presented as a finite set of
identities or conditional identities. Burstall and Goguen [7] and Goguen [20] specify a data
type as an algebraic theory.

The axiomatic approach for specifying a data type can be used for either of the 
two definitional approaches discussed above. If a data type is defined using the model 
approach, a specification using the axiomatic approach consists of the properties of the 
models of a data type. Otherwise, a specification consists of a subset of the theory serving 
as the definition of the data type. The axiomatic approach followed by Nakajima et al.,
Hoare [38, 39], and Standish [69] uses the full first order predicate calculus to specify data
types. The approach advocated by Guttag et al. uses a restricted set of formulas, namely 
equations and conditional equations. 

Our approach is also axiomatic. A specification expresses the normal behavior of 
a data type(s) (which is a set of algebras) as equations and conditional equations, and its 
exceptional behavior as restrictions. As is stated in the previous section, these formulas are 
interpreted using the restrictions in a different way than in the algebraic approach. In
contrast to the specification methods proposed by Nakajima et al., Hoare, and Standish, the
general properties of data types are not explicitly stated in our method. A specification
provides an incomplete (in the sense of Logic) first order axiomatization of the data types
being specified. From a properly designed specification, it is possible to derive most of the
interesting properties of a data type needed in program verification. 

The major focus of Zilles' work and the ADJ group's work has been to extend the 
theory of heterogeneous algebras to capture the meaning of data types. They have not
investigated how to use the definition of a data type for proving properties of programs
using data types. Zilles [76] has suggested an ad hoc method for establishing correctness of
an implementation of a data type; however, the method as well as its foundation have not
been fully developed. The ADJ group and Ehrig et al. [15] have proposed an algebraic
approach for establishing the correctness of an implementation of a data type in which they






have attempted to incorporate the algebraic semantics of the control structures of the
programming language used for the implementation. Although the ADJ group's work is 
rigorous, there are two main problems with it: 

(i) it has not embodied the view of data types taken in programming languages, and is 
thus useful only for a small set of data types, and

(ii) it is complex. 
The approach taken by Burstall and Goguen [7] seems more promising than the ADJ 
group's approach from the viewpoint of program verification, but, we have been told, its 
category theoretic semantics again seems to introduce unnecessary complexity [30]. 

Guttag et al. have focused on using specifications for proving properties of data 
types and programs using data types. The nice aspect of their approach is that it captures
the view of a data type taken in programming languages. Our research formalizes, provides
a mathematical basis for, and extends their approach. 

The ADJ group [23] has been the first to investigate rigorously the exceptional 
behavior of a data type. In their method, the set of values of every data type is extended to 
include a distinguished value, called error. Using special auxiliary functions which test 
whether an arbitrary value is an error, they specify the exceptional and normal behavior of 
a data type. Goguen [20] has enriched and structured their approach. Our approach is
based on Guttag's recent suggestions for separating the exceptional behavior of a data type 
from its normal behavior [31]. 






1.3 Outline of the Thesis 

The second chapter introduces a formalism for defining a data type. We first 
discuss the formalism for data types assuming that the operations do not signal exceptions. 
Later, we extend the formalism to incorporate the exceptional behavior of the operations. 

The third chapter describes the specification language, gives its semantics, and 
defines the consistency and behavioral completeness properties of a specification. 

The fourth chapter discusses the deductive system. We discuss how a theory of a 
set of data types serving as the semantics of a specification can be constructed from the
specification. We first describe the deductive system for specifications specifying neither
nondeterministic operations nor the exceptional behavior of the operations; later, we
discuss specifications specifying the exceptional behavior of the operations, and finally, we 
incorporate nondeterminism. We discuss the deductive system incrementally introducing 
its various components; we first discuss the equational theory, then the distinguishability 
theory, later the inductive theory, and finally, the full theory. 

The fifth chapter discusses a correctness criterion for an implementation with
respect to a specification and a methodology embodying the criterion. The correctness of
recursive and mutually recursive implementations is also briefly discussed. 

The sixth chapter presents conclusions and directions for future research. 






2. Definition of an Abstract Data Type 

In this chapter, we develop a formalism to define an abstract data type. We take a
behavioral view for defining a data type in which every value of the data type is constructed
by finitely many applications of its constructors and these values are distinguishable only
by means of its operations. We adopt the model approach: A data type is defined to be a
set of behaviorally equivalent type algebras, where a type algebra is an extended
heterogeneous algebra with additional properties needed to model data types. The
syntactic structure of a data type determines the structure of type algebras in the set. Every
type algebra in the set is called a model of the data type. A model provides an explicit
meaning (interpretation) for the values and the operations of a data type; in this way, it
captures concretely the informal description of a data type in our mind. The model
approach for defining a data type is closer to the intuition of a programmer than the
axiomatic approach as in [62, 63], where a data type is defined as a first order theory.

The crucial concept in the definition of a data type is that of behavioral
equivalence of type algebras. The definition of behavioral equivalence captures the
informal notion that two behaviorally equivalent type algebras have the same behavior as
observed through their operations. We are interested in how the interpretations of the
values and the operations of a data type in a model behave, and not in how they are
represented. We have decided not to pick a particular model to be the definition of a data
type because we do not want the irrelevant details of the model to be associated with the
data type. We have only considered the input-output behavior of the operations of a data
type.

Behavioral equivalence abstracts from (i) multiple representations of a value for
a representational structure as well as from (ii) the representational structure of the values
in an algebra. Thus type algebras differing only in the representational structure of their
values are behaviorally equivalent; furthermore, type algebras using the same
representational structure but differing in the number of representations a value has are
also behaviorally equivalent. The property (i) above is achieved by defining a congruence,
called the observable equivalence relation, on a type algebra, and the property (ii) is 






achieved by the standard algebraic concept of isomorphism. The distinguishability
relation, which is the complement of the observable equivalence relation, on the
representations of the values of the data type is defined inductively in terms of the
distinguishability of the representations of the values of the defining types of the data type.
(The basis of this induction is any data type with no defining types, and in particular, the
data type boolean whose two values, true and false, are assumed to be distinguishable.)
Two representations are distinguishable if and only if there is a sequence of operations 
having an observer as the outermost operation, that produces distinguishable results when 
applied separately on the representations. 
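
A rough sketch of this notion for the finite set of integers follows. It assumes a hypothetical list representation that may contain duplicates, and it only searches operation sequences up to a fixed depth over a small sample of integer arguments, so it approximates the relation rather than deciding it. The names `SAMPLE`, `observations`, and `distinguishable` are illustrative assumptions.

```python
from itertools import product

# Assumption: a set of integers is represented by a list that may contain
# duplicates, so one abstract value has many representations.
def insert(s, i): return s + [i]
def remove(s, i): return [x for x in s if x != i]
def has(s, i): return i in s
def size(s): return len(set(s))

SAMPLE = [0, 1, 2]  # assumed small sample of integer arguments
STEPS = [(f, i) for f in (insert, remove) for i in SAMPLE]

def observations(rep):
    """Results of every observer applied directly to rep."""
    return [size(rep)] + [has(rep, i) for i in SAMPLE]

def distinguishable(r1, r2, depth=2):
    """Bounded search for a sequence of operations, with an observer as the
    outermost operation, that gives different results on r1 and r2."""
    for n in range(depth + 1):
        for steps in product(STEPS, repeat=n):
            x, y = r1, r2
            for f, i in steps:
                x, y = f(x, i), f(y, i)
            # Observer results are integers or booleans, for which
            # distinguishability is taken to be plain inequality.
            if observations(x) != observations(y):
                return True
    return False
```

Under these assumptions, `[1, 2, 2]` and `[2, 1]` are not distinguished by the search, while `[1]` and `[2]` are distinguished immediately by `has`.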

If the operations of a data type signal exceptions, then two representations can
also be distinguished due to the exceptional behavior of the operations. If a sequence of
operations signals on one representation and does not signal on the other, or if it signals
different exceptions on the two representations, then they are distinguishable.

The model used for nondeterminism is simple. If a nondeterministic operation
behaves nondeterministically on an input (i.e., it has a choice to return one of the many
possible results), we expect it to return every possible result. We do not consider how these
results are scheduled by an implementation of the operation. Two operations having
different amounts of nondeterminism are considered to have different observable behavior
because for some input, they can always return distinguishable results. The definition of
distinguishability relation on representations of the values of a data type incorporates this
view of nondeterminism.

In the first section, we introduce terminology, define hierarchically structured
data types, and informally discuss the minimality property of a data type. We assume data
types to be hierarchically structured and defined one at a time. There are however no
technical problems in our formalism in handling mutually recursive data types which are
not defined separately. We outline the simple extensions of the formalism to such data
types in the last section of the chapter. Until the point where we define a data type, we
have used the notion of a data type in an informal way to motivate the formalism
developed.

In the second section, we first introduce the formalism for defining a data type 






assuming that its operations do not signal exceptions. Our definitional method is 
hierarchical; we assume that the definitions of the defining types are given. We motivate 
and discuss in detail the distinguishability relation on the representations of the values. We 
then precisely define the behavioral equivalence relation on type algebras. 

In the third section, we incorporate the exceptional behavior of a data type and 
discuss extensions to the formalism introduced in the second section. We extend a type 
algebra and the behavioral equivalence relation on type algebras to capture the normal as 
well as the exceptional behavior of the operations. We compare our approach with 
Goguen's approach of modeling the exceptional behavior [20, 21]. We also formalize a
simpler approach for modeling the exceptional behavior which has been generally assumed
in the literature on algebraic specification of data types [25, 27, 77, 23]. We compare our
definition of a data type with the definition used by the ADJ group [23] which abstracts 
only from the representation structure of the values in a type algebra. 






2.1 Informal Description of a Data Type 

We use the data type finite set of integers for illustration; let Set-Int be its name.
Set-Int has been widely discussed in the literature [37, 76, 74, 31]. It has the following 
operations: 

Null    a constant (or 0-ary operation) returning the empty set of integers;

Insert  constructs a finite set of integers by adding a given integer to a given finite set of
        integers;

Remove  constructs a finite set of integers by deleting a given integer from a given finite set of
        integers;

Has     checks whether a given integer is an element of a given finite set of integers;

Size    results in an integer giving the size of a given finite set of integers.

In addition, we assume that Set-Int has an additional operation Choose, which has
non-deterministic behavior. Choose returns an arbitrary element of a given non-empty set
of integers; for the time being, we arbitrarily assume that Choose returns the integer '0' for
the empty set. This behavior of Choose for the empty set may not be adequate for some
applications. In Section 2.3, we modify Choose so that it signals an exception for the empty
set.
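
The informal description of Set-Int can be sketched as an immutable Python class. This is only one possible model, assuming a `frozenset` as the representation; the class and method names are illustrative, and `choose` follows the provisional convention of returning 0 for the empty set.

```python
import random

class SetInt:
    """Immutable finite set of integers; every operation returns a new value."""

    def __init__(self, elems=frozenset()):
        # Assumed representation: a frozenset of the member integers.
        self._elems = frozenset(elems)

    @staticmethod
    def null():
        """Null: the empty set of integers."""
        return SetInt()

    def insert(self, i):
        """Insert: the set obtained by adding integer i."""
        return SetInt(self._elems | {i})

    def remove(self, i):
        """Remove: the set obtained by deleting integer i."""
        return SetInt(self._elems - {i})

    def has(self, i):
        """Has: whether i is an element of the set."""
        return i in self._elems

    def size(self):
        """Size: the number of elements in the set."""
        return len(self._elems)

    def choose(self):
        """Choose: an arbitrary element; 0 for the empty set (provisional)."""
        if not self._elems:
            return 0
        return random.choice(sorted(self._elems))
```

For instance, `SetInt.null().insert(3).insert(5).insert(3)` has size 2, and `choose` on it may return either 3 or 5.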

2.1.1 Terminology 

To simplify the mathematics, we assume that an operation has a cartesian product 
(possibly empty) of data types as its domain and a single data type as its range. An 
operation having a cartesian product of n data types (n > 1) as its range can be viewed in 
one of the following two ways depending on whichever is more convenient: (i) The 
operation is modeled as a family of n operations, each having the same domain as the 
original operation and a different type in the cartesian product as the range, or (ii) the 
cartesian product is viewed as a single type. We use the first method in the thesis. 

Let D be the name of a new data type being defined, and Ω be the finite set of
symbols naming its operations. Let A' stand for the set of names of data types appearing
either as a component of the domain or as the range of an operation in Ω. Let A be
A' - { D }.¹ D is the defined type and every data type in A is a defining type of D.

In order to include the syntactic specification (i.e., the domain and range
specifications) of the operations, we index every operation σ in Ω by a pair (d, r), where d is
a string made from the alphabet A' and r is an element of A'; d specifies the domain of σ
and r specifies its range.

Let Int stand for the data type integers and Bool stand for the data type boolean.
For Set-Int, A = { Int, Bool }, A' = { Int, Bool, Set-Int } and
Ω = { Null, Insert, Remove, Has, Size, Choose }. The index of Insert, for example, is
(Set-Int • Int, Set-Int).

As is discussed in the first chapter, the operations of D can be classified as
constructors and observers. Let Ω_c be the subset of Ω consisting of all constructors of D
(recall that a constructor is an operation having D as its range). For example, Null, Insert,
and Remove are the constructors of Set-Int. The constructors construct all the values of D.
Some constructors construct a value of D using only the values of the defining types of D.
We call such a constructor a basic constructor. For example, Null is a basic constructor of
Set-Int. Every data type is required to have at least one basic constructor; otherwise, it will
not have any values.

Let Ω_o be the subset of Ω consisting of all observers of D. An observer examines
the values of D; it takes at least one argument of type D, and returns a value of a defining
type of D. For example, Has, Size, and Choose are the observers of Set-Int. Every
interesting data type must have at least one observer; otherwise there is no way to
distinguish among different values of D [25] other than by the operations signalling on the
values. An observer is also called an inquiry operation [31].

We thus assume that every operation of D either results in a value of D, or takes
an argument of type D, or both. We consider a data type having an operation not satisfying 
this requirement to be not properly designed, because the behavior of such an operation 
does not depend on the data type. 
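
The terminology of this subsection can be made concrete with a small sketch for Set-Int. The dictionary encoding of the index (d, r) and all function names are assumptions made for illustration; the domains and ranges follow the informal description of the operations given earlier.

```python
# Hypothetical encoding of the syntactic specification of Set-Int.
# Each operation name maps to its index (d, r): d is the domain, written as a
# tuple of type names, and r is the range.
D = "Set-Int"
OMEGA = {
    "Null":   ((), "Set-Int"),
    "Insert": (("Set-Int", "Int"), "Set-Int"),
    "Remove": (("Set-Int", "Int"), "Set-Int"),
    "Has":    (("Set-Int", "Int"), "Bool"),
    "Size":   (("Set-Int",), "Int"),
    "Choose": (("Set-Int",), "Int"),
}

def type_names(omega):
    """A': every type name occurring in a domain or as a range."""
    names = set()
    for dom, rng in omega.values():
        names.update(dom)
        names.add(rng)
    return names

def defining_types(omega, d):
    """A = A' - { D }."""
    return type_names(omega) - {d}

def constructors(omega, d):
    """Operations having D as their range."""
    return {op for op, (dom, rng) in omega.items() if rng == d}

def basic_constructors(omega, d):
    """Constructors that use only values of the defining types."""
    return {op for op, (dom, rng) in omega.items() if rng == d and d not in dom}

def observers(omega, d):
    """Operations taking an argument of type D and returning a defining type."""
    return {op for op, (dom, rng) in omega.items() if d in dom and rng != d}

def properly_designed(omega, d):
    """Every operation results in a value of D, takes one as argument, or both."""
    return all(rng == d or d in dom for dom, rng in omega.values())
```

With this encoding, the constructors are Null, Insert, and Remove, the only basic constructor is Null, and the observers are Has, Size, and Choose.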



1. Henceforth we will not distinguish between a data type and its name, and between an operation and its 
name, unless needed. 






Let Ω_nd stand for the set of nondeterministic operations of D. We allow any kind
of operation, an observer or a constructor, to be nondeterministic. In our experience,
however, we have found that a nondeterministic operation is often an observer.²

2.1.2 Hierarchical Structure 

We define the following two relations on a set of data types for capturing the 
dependency structure among the data types: 

Def. 2.1 D directly depends on every D' ∈ A, and does not directly depend on any other data
type. □

Def. 2.2 D depends on D' if (i) D directly depends on D', or (ii) there is a D" such that D
directly depends on D" and D" depends on D'. □

The direct dependency relation captures one level of hierarchical dependency. The
dependency relation is the transitive closure of the direct dependency relation. We define

(D)⁺ = { D' | D depends on D' }, and

(D)* = (D)⁺ ∪ { D }.

If data types are designed so that every data type on which D depends is assumed to be
designed independently of D, then the dependency relation on (D)* will not have any
cycles and is a strict partial order on data types. In such a case, data types are said to be
hierarchically structured, and they can be defined incrementally one at a time. Data types
on which D depends do not have to be designed in any particular order relative to D; any
approach, for example top-down, bottom-up, etc., is compatible. Unless stated otherwise,
we assume in the thesis that data types are hierarchically structured.
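
The two relations can be computed mechanically, as the following sketch suggests. The table `DIRECT` is a hypothetical example; in particular, the entry stating that Int directly depends on Bool is an assumption made for illustration (for instance, because Int has an equality test returning a boolean).

```python
# Hypothetical direct-dependency table: each type maps to its defining types.
# The entry for Int is an assumption made for this example.
DIRECT = {
    "Bool": set(),
    "Int": {"Bool"},
    "Set-Int": {"Int", "Bool"},
}

def depends_plus(d, direct):
    """(D)+ : every type D depends on (transitive closure of direct dependency)."""
    seen, stack = set(), list(direct[d])
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(direct[t])
    return seen

def depends_star(d, direct):
    """(D)* = (D)+ together with D itself."""
    return depends_plus(d, direct) | {d}

def hierarchically_structured(direct):
    """True if no type depends on itself, i.e. the relation has no cycles."""
    return all(t not in depends_plus(t, direct) for t in direct)
```

A table in which two types list each other as defining types would make `hierarchically_structured` return `False`, which corresponds to the mutually recursive case set aside in this chapter.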

We assume that the partial order induced by the dependency relation on the set of 
hierarchically structured data types has finite descending chains. The bottom of every 



2. In case a constructor σ is nondeterministic, σ is usually derived with respect to a subset Ω_R of deterministic
constructors (Ω_R ⊆ Ω_c) in the sense that σ does not return any value that cannot be constructed using the
constructors in Ω_R.






chain is a data type having no defining type. Throughout this thesis, we assume that the 
data type boolean does not have any defining type; Bool serves as the bottom element of
the chains in the partial ordering for all interesting data types as will be clear from the
discussion in Section 2.2. (The definition of Bool is given in Section 2.2.) We will often use
the structure induced by the dependency relation on the set of data types for inductively
defining properties of data types, as well as for proving properties about data types. Bool
will often serve as the basis step of such definitions and proofs (in general, data types
having no defining type serve as the basis).

2.1.3 Minimality Property 

The requirement on a data type behavior imposed because of the modularity and 
good program design considerations that its values be manipulated only by its operations 
translates to requiring that its values be constructed only by its constructors, possibly using 
abstractly the values of its defining types. Furthermore in a computer the values can be 
constructed only by a finite sequence of operations, so the values of a data type constitute 
the smallest set closed under finitely many applications of its constructors. We call this 
property of a data type the minimality property. 

We require that every data type under consideration satisfy the minimality 
property. This requirement constrains the implementations of a data type to be protected in 
Morris's sense [59]. An implementation of a data type defined in a strongly typed language 
that hides the representation of its (data type) values from its users by providing an 
encapsulation mechanism, as in CLU, ALPHARD, etc., is protected. The minimality 
requirement does not rule out data types defining 'infinite' values, insofar as these values 
can be finitely described.³



3. For example, we can define a data type infinite sequence of squares, whose values are infinite sequences of
consecutive squares starting from n², for every n > 0. It has a constructor, Cons, which takes a natural
number as an argument and returns an infinite sequence. In addition, it has three observers: First, which
gives the first element in the sequence; Rest, which gives the remaining sequence after stripping the first
element; and Equal, which checks whether two infinite sequences are equal or not.






The minimality property serves as the basis of a powerful induction rule for a data
type D: To prove that a property P holds for D, i.e., for all values of D, we need to show
that P is preserved by every constructor of D. Wegbreit and Spitzen [72] called this
generator induction; Guttag et al. [27] called it data type induction. We discuss this
induction rule in detail in Chapter 4 on the deductive system for data types.

Since every operation of D is assumed to be computable, it can be easily shown by
induction on data types that the set of values of D is recursively enumerable.⁴ This is
based on the fact that the set of sequences of constructors is recursive. This thesis considers
data types with a recursively enumerable set of values and a finite set of total computable
operations.
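
The enumerability claim can be illustrated for Set-Int with a small generator. The sketch assumes values are modeled as `frozenset`s and only applies Insert, since Remove produces no value that Insert cannot; the staging by a growing bound on the integer argument is an illustrative choice, not the construction used in the thesis.

```python
from itertools import islice

def enumerate_set_int():
    """Yield each finite set of integers exactly once. Every such set is
    reached after finitely many Insert steps starting from Null."""
    seen = {frozenset()}          # Null, the basic constructor
    yield frozenset()
    bound = 0
    while True:
        bound += 1
        # Stage `bound`: apply Insert with every integer in [-bound, bound]
        # to every value produced in earlier stages.
        for s in list(seen):
            for i in range(-bound, bound + 1):
                t = s | {i}       # Insert(s, i)
                if t not in seen:
                    seen.add(t)
                    yield t

first_values = list(islice(enumerate_set_int(), 15))
```

Each value appears at some finite position of the enumeration, which is all that recursive enumerability requires; no claim is made that the listing is efficient.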



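4. A set S is recursive iff its characteristic function, which checks whether a given element is a member of S
or not, is total computable. A nonempty set S is recursively enumerable (r.e.) iff it is the range of a total
computable function. In other words, an r.e. set S can be listed by a total computable function.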






2.2 Formalism 

In this section, we describe the formalism to state precisely what a data type is. To 
simplify the presentation, we assume that data types do not have any exceptional behavior, 
i.e., their operations do not signal any exceptions. Every operation terminates normally on 
every input in its domain. 

This section is organized as follows. We first extend the notion of a 
heterogeneous algebra as defined in [4] to model nondeterminism; then we define a type 
algebra to be an extended heterogeneous algebra with additional properties. The domain 
corresponding to the defined type D consists of the representations of the values of D and 
is called the principal domain of the type algebra. To extract the behavior of a type algebra 
as observed through its operations, we must 

(i) abstract from the multiple representations of a value, assuming a particular 
representational structure, and 

(ii) abstract from the representation structure of the values and operations in a type
algebra.

To do the first, we define an interpretation of a term in a type algebra, where a term 
expresses a sequence of operations. Terms are used to observe the behavior of the
representations of the values of the defined type in a type algebra in terms of the 
representations of the values of the defining types. We define the observable equivalence 
and distinguishability relations on the principal domain of a type algebra. These relations 
are defined inductively using the corresponding relations on the domains corresponding to 
the defining types in the type algebra. Observable equivalence is an equivalence relation
and is preserved by the functions in a type algebra; it relates two values having the same
behavior. We then define the behavioral equivalence relation on type algebras which relates
two type algebras having the same observable behavior. A data type is an equivalence class
defined by the behavioral equivalence relation, and every type algebra in the equivalence
class is a model of the data type. A model of a data type concretely defines the value set, 
which is the principal domain of the model, and the operations of the data type. 

Most of the definitions throughout this section are inductive; they make use of 






the dependency relation, which is a strict partial order with finite descending chains on
hierarchically structured data types. An inductive definition of a concept has three parts:

(i) basis part, which deals with the case of a data type D having no defining type, i.e., its A
is the null set,

(ii) inductive part, which deals with the case of a data type having defining types, and

(iii) closure part, which states that the above two ways are the only ways of defining a
concept.

To avoid repetition, we omit the closure part, and if the basis part can be derived from the
inductive part by assuming Δ to be the null set, we give only the inductive part of the
definition. Some of the definitions, namely the definitions of type algebra (Def. 2.3),
distinguishability and observable equivalence relations (Defs. 2.6 and 2.7), and data type
(Def. 2.14), are mutually recursive. Definitions 2.3, 2.6, and 2.7 assume the definitions
of the defining types in Δ in their inductive part.

We would like to motivate various concepts and definitions introduced on type
algebras. So for exposition purposes, we may refer to a type algebra as though it is a model
of a data type being discussed.⁵

2.2.1 Type Algebras 

A heterogeneous algebra as defined by Birkhoff and Lipson [4] is a finite indexed
set of sets (called domains in the thesis) and a finite indexed set of total functions. We
extend this definition to model the nondeterministic operations of a data type. An
extended heterogeneous algebra can have either a total (deterministic) function or a total
nondeterministic function.

5. We are technically justified to do so as almost every type algebra is a model of some data type.

A nondeterministic function f : X → Y is similar to a function in mathematics with
the exception that it has a choice among a subset of possible results when applied on an
input x ∈ X. Let f(x) stand for an arbitrary result of applying f on x (f can be characterized
using a relation R ⊆ X × Y such that f(x) ∈ R(x)).⁶ If R(x) is a singleton set for some input
x, then f is said to be deterministic on x. By { f(x) } we will mean the set R(x); in this way
we do not have to refer to R. Since we assume every nondeterministic operation to have
finitely many choices on a particular input, { f(x) } is always a finite set. We admit that
calling f a nondeterministic function is an abuse of the term function; however, we feel this
term conveys the behavior of f well. Henceforth, by the term function we mean either a
mathematical (deterministic) function or a nondeterministic function, unless qualified. We
have chosen a nondeterministic function over the corresponding deterministic relation for
modeling a nondeterministic operation because (i) in contrast to the nondeterministic
function, the relation models the nondeterministic operation indirectly, and (ii) it is
inconvenient and unnatural to express the behavior of a computation scheme involving
nondeterministic operations by means of the relations corresponding to the
nondeterministic operations.
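
As an illustration only (it is not part of the formalism), the idea above can be sketched
in Python by encoding a nondeterministic function as an ordinary function that returns the
finite set { f(x) } = R(x) of its possible results. The names pick and is_deterministic_on
are hypothetical and chosen for this sketch.

```python
from typing import Callable, FrozenSet, Hashable

# Assumed encoding: a nondeterministic function f is represented by the map
# x -> { f(x) }, i.e. the finite set R(x) of all results f may return on x.
NDFunc = Callable[[Hashable], FrozenSet[Hashable]]


def pick(xs: FrozenSet[int]) -> FrozenSet[int]:
    """Hypothetical nondeterministic function: may return any element of xs.

    It is total: on the empty input it returns the fixed result 0.
    """
    return xs if xs else frozenset({0})


def is_deterministic_on(f: NDFunc, x: Hashable) -> bool:
    """f is deterministic on x iff R(x) is a singleton set."""
    return len(f(x)) == 1
```

Under this encoding a deterministic function is simply one whose result set is a singleton
on every input, so both kinds of function can share one representation.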

The definitions of concepts such as congruence, homomorphism, isomorphism on
heterogeneous algebras [4] are revised for extended heterogeneous algebras in Appendix II.
Henceforth, we use the term heterogeneous algebra to mean an extended heterogeneous
algebra.

A type algebra is a heterogeneous algebra with additional properties. For a data
type D, we are interested in type algebras having a particular structure, which is determined
by Δ' and Ω of D. The sets Δ' and Ω serve as the index sets of the type algebras of interest
for D. We call such an algebra an algebra of type D, or simply a type algebra when D is
evident from the context. The triple (D, Δ, Ω) is called the (similarity) type of such an
algebra. An algebra A of type D consists of a domain corresponding to every type name
D' ∈ Δ' and a function of the appropriate arity corresponding to every operation name in Ω.
The domain corresponding to D is the principal domain of A. The function corresponding
to σ is called the interpretation of the operation symbol σ in A. The domain corresponding
to a defining type D' ∈ Δ is the interpretation of D'.



6. For a relation R, a subset of X × Y, R(x) stands for the subset { y | <x, y> ∈ R } of Y for an x ∈ X, and
R(A) stands for { y | <x, y> ∈ R, x ∈ A }, where A ⊆ X.






We assume that every defining type D' in Δ of D is defined elsewhere and we are
given the models of D' (see Subsection 2.2.6 for the definition of a data type and a model of
a data type). The interpretation of a data type D' ∈ Δ in an algebra of type D is fixed. We
use the models of each D' ∈ Δ to define type algebras of D. The domain corresponding to
D' ∈ Δ in a type algebra A of D is the value set of D' defined by some model A' of D'. A
type algebra A of D explicitly includes only the interpretations of the operation names of
D, and does not include the interpretations of the operation names of any defining type D'.
We assume that every operation name of a defining type D' has the same interpretation in
A of D as its interpretation in the model A' of D'. In this way, we define the interpretation
of every operation name of a data type D'' ∈ (D)*⁷ in a type algebra A of D. An algebra A
of type D is thus really a huge structure having interpretations for every data type in (D)*.

Def. 2.3 An algebra A of type D is a heterogeneous algebra

    [ { V_D' | D' ∈ Δ' } ; { f_σ | σ ∈ Ω } ],

such that

(i) for every defining type D' ∈ Δ, V_D' is the value set of D' defined by a
model A' of D',

(ii) for every σ ∈ Ω, f_σ is a total function of the appropriate arity, i.e., if σ has
D_1 × ... × D_n as its domain and D' as its range,⁸ then f_σ has
V_{D_1} × ... × V_{D_n} as its domain and V_D' as its range, and

(iii) V_D is the smallest set closed under finitely many applications of the
functions corresponding to the constructors of D, i.e.,

    V_D = ∪_{j ≥ 0} V_D^j, where V_D^0 = ∅ and

    V_D^{j+1} = { f_σ(v_1, ..., v_n) | for each σ ∈ Ω_C such that
                  σ : D_1 × ... × D_n → D, v_i ∈ ∪_{k ≤ j} V_D^k if D_i = D,
                  and v_i ∈ V_{D_i} if D_i ≠ D }.



7. Recall that (D)* is the set consisting of D and all data types on which D depends.

8. i.e., (D_1 × ... × D_n, D') is the index of σ.






So, V_D is the principal domain of A, and f_σ is the interpretation in A of the operation name
σ ∈ Ω. We do not require the interpretation f_σ of σ to be a deterministic function if σ is
deterministic and f_σ to be a nondeterministic function when σ is nondeterministic; the
reason for this will become clear in Subsections 2.2.5 and 2.2.6 on the behavioral
equivalence of type algebras and the definition of a data type respectively.

If any f_σ in A is a nondeterministic function, then A is called a nondeterministic
type algebra; otherwise, if every f_σ is deterministic, then A is called a deterministic type
algebra. Henceforth, in the context of an algebra A of type D, by an operation σ we mean
its interpretation f_σ and by a value of D we mean an element of V_D.

The property (iii) above is due to the requirement that D satisfies the minimality
property. For a constructor σ, if f_σ is nondeterministic, then V_D is closed under f_σ
assuming f_σ could return any possible result for an input. Once the value set corresponding
to each defining type D' is fixed, then obviously V_D is uniquely determined by
{ f_σ | σ ∈ Ω_C }, and is nonempty, because Ω_C is nonempty and has at least one basic
constructor (see Section 2.1).

2.2.2 Examples of Type Algebras 

We discuss below a type algebra A_si of Set-Int. A_si is a natural model of Set-Int
in the sense that its principal domain is the set of all finite sets of integers, and the
interpretations of its operations are defined in terms of the standard set operations [6].



A_si = [ { S, Z, B } ; { Nu, In, Re, Ha, Si, Ch } ],
where B = { true, false }, a value set of Bool,
      Z = { 0, 1, -1, 2, -2, ... }, a value set of Int, and
      S = { ∅, {0}, {1}, {-1}, {2}, {-2}, {0, 1}, {0, -1}, {0, 2},
            {0, -2}, {1, -1}, {1, 2}, ... }, the domain corresponding to Set-Int.

The domains Z and B are defined elsewhere by the models of Int and Bool, respectively.

The first two letters of an operation name are used to denote in A_si the total
function corresponding to the operation. These functions are defined below. We will use
any convenient mathematical formalism to give the definitions of the functions. We use
the symbol ' ≜ ' as the definition symbol; the symbol ';' marks the beginning of a
comment in a definition, running until the end of the line.

Nu        ≜ ∅

In(s, i)  ≜ s ∪ {i}

Re(s, i)  ≜ s - {i}            ; - is the difference operator

Ha(s, i)  ≜ i ∈ s

Si(s)     ≜ #(s)               ; the cardinality of the set

Ch(s)     ≜ 0                       if s = ∅
            i such that i ∈ s,      otherwise.

Ch is a nondeterministic total function; if s is not ∅, then { Ch(s) } = s.
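
The definitions above can be sketched in Python as follows. This is a minimal illustration
that assumes Python frozensets for the domain S and encodes the nondeterministic Ch by the
set { Ch(s) } of its possible results; the full operation names Remove and Has are assumed
from the two-letter abbreviations.

```python
# Sketch of A_si using Python frozensets for the domain S.
# Function names follow the two-letter convention used in the text.
Nu = frozenset()                 # Null


def In(s, i):                    # Insert
    return s | {i}


def Re(s, i):                    # Remove (name assumed from "Re")
    return s - {i}


def Ha(s, i):                    # Has (name assumed from "Ha")
    return i in s


def Si(s):                       # Size
    return len(s)


def Ch(s):
    """Choose, encoded by { Ch(s) }: the set of results it may return."""
    return s if s else frozenset({0})
```

For instance, Ch(In(In(Nu, 0), 1)) yields the two-element result set containing 0 and 1,
matching the statement that { Ch(s) } = s for a nonempty s.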

We discuss another type algebra A_si^1 of Set-Int in which the set values are
represented as finite sequences of nonrepeating integers.

A_si^1 = [ { SQ^1, Z, B } ; { Nu^1, In^1, Re^1, Ha^1, Si^1, Ch^1 } ],
where SQ^1 = { <>, <0>, <1>, <-1>, <2>, <-2>, <0, 1>, <0, -1>, <0, 2>, <0, -2>,
               <1, 0>, <1, -1>, <1, 2>, <1, -2>, <-1, 0>, <-1, 1>, <-1, 2>, <-1, -2>,
               <2, 0>, <2, 1>, ... }, the domain corresponding to Set-Int.
The set SQ^1 contains all finite sequences of integers not having multiple occurrences of the
same integer; for example, <0, 0> and <0, 1, -1, 1> are not in SQ^1. Let s stand for an element of
SQ^1. So, s = <i_1, ..., i_m>, m ≥ 0; if m = 0, then s = <>.

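Nu^1                       ≜ <>

In^1(<i_1, ..., i_m>, i)   ≜ <i_1, ..., i_m>        if ∃ j, 1 ≤ j ≤ m, i_j = i
                             <i_1, ..., i_m, i>     otherwise

Re^1(<i_1, ..., i_m>, i)   ≜ <i_1, ..., i_{j-1}, i_{j+1}, ..., i_m>    if ∃ j, 1 ≤ j ≤ m, i_j = i
                             <i_1, ..., i_m>        otherwise

Ha^1(<i_1, ..., i_m>, i)   ≜ ∃ j, 1 ≤ j ≤ m, i_j = i

Si^1(s)                    ≜ m

Ch^1(<i_1, ..., i_m>)      ≜ 0                          if m = 0
                             i_j such that 1 ≤ j ≤ m,   if m > 0

Ch^1 is a nondeterministic function; { Ch^1(<i_1, ..., i_m>) } = { i_1, ..., i_m } for m > 0.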
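
A sequence-based algebra of this kind can be sketched in Python with tuples. The sketch
below is an illustration under stated assumptions: the position at which Insert places a new
integer is not recoverable with certainty from the text, so appending at the end is assumed,
and the names Nu1, In1, Re1, Ha1, Si1, and Ch1 are chosen for this sketch.

```python
# Sketch of A_si^1: sets represented as tuples without repeated integers.
Nu1 = ()


def In1(s, i):
    # Assumption: a new integer is appended at the end of the sequence.
    return s if i in s else s + (i,)


def Re1(s, i):
    return tuple(x for x in s if x != i)


def Ha1(s, i):
    return i in s


def Si1(s):
    return len(s)


def Ch1(s):
    """{ Ch1(s) }: any member of the sequence; 0 for the empty sequence."""
    return frozenset(s) if s else frozenset({0})
```

Two different insertion orders give different tuples, such as (1, 2) and (2, 1), yet every
operation above returns the same results on both, which is the situation the later
discussion of observable equivalence addresses.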
2.2.3 Interpretation of Terms 

A term is constructed using the operation names of types in (D)* and the typed
variables. It expresses a sequence of operations, so it forms a straight-line program. The
interpretation of a term in a type algebra is like the execution of such a program. The
interpretation of all terms characterizes the behavior of a type algebra.

We assume that we have as many variables (possibly infinite) of every type
D' ∈ (D)* as needed.

Def. 2.4 A term of type D' ∈ (D)* is defined inductively as follows:

(i) A variable x of type D' is a term of type D',

(ii) if σ is an operation of some type D'' ∈ (D)* such that its domain is
D_1 × ... × D_n and its range is D', then 'σ(e_1, ..., e_n)' is a term of type
D' if and only if each e_i is a term of type D_i ∈ (D)*. ■

If a term has no variables, it is called a ground term. A term of type Bool is called a boolean
term. When we wish to refer to the variables of e, we write e as e(x_1, ..., x_n) (or e(X)),
where the set { x_1, ..., x_n } (or X) consists of all variables in e. A subterm of a term that is
a variable is the term itself. The subterms of a term of the form 'σ(e_1, ..., e_n)' are (i) the
term 'σ(e_1, ..., e_n)' itself, (ii) all subterms of e_1, ..., e_n, and nothing else.

An interpretation of a ground term e in an algebra A of type D is obtained by
performing the sequence of operations expressed by e. A ground term e of type D' is
interpreted in A as follows: If e is a 0-ary operation name σ, an interpretation of e is the
result of applying the interpretation of σ in A. If e is 'σ(e_1, ..., e_n)', an interpretation of e
is the result of applying the interpretation of σ in A on the interpretations of e_1, ..., e_n in
A. An interpretation of e is an element of V_D'. Since e may be constructed using
nondeterministic operation names, e can have many interpretations. Let e|_A stand for an
arbitrary interpretation of e in A.

For example, let us assume that the defining type Int of Set-Int has the
constructors 0, 1, 2, and 3, and that they have the standard interpretation in a model of Int.
Then e_1 = Insert(Insert(Null, 0), 1) and e_2 = Choose(e_1) are ground terms of types Set-Int
and Int respectively. We have,
    e_1|_{A_si} = {0, 1}, and
    e_2|_{A_si} = 0 or 1.

Since every operation name of a data type D' ∈ (D)* has a total function as its
interpretation in an algebra A of type D, we have

Prop. 2.1 Every ground term of type D' ∈ (D)* has an interpretation in A. ■

Furthermore, since every data type under consideration has the minimality property, we
have

Prop. 2.2 Every value in V_D is an interpretation of some ground term of type D.

Proof Straightforward, by induction on type algebras using the dependency relation. ■

For a term e of type D' having variables, its interpretation is a function, which is
denoted by f_e. If e has nondeterministic operation names, then f_e is in general a
nondeterministic function. Let { x_1, ..., x_n } be the only variables in e and D_i be the type
of x_i. Then f_e has V_{D_1} × ... × V_{D_n} as its domain and V_D' as its range. If the variables
x_1, ..., x_n in e are instantiated in A to be the values v_1, ..., v_n respectively, from the
appropriate domains in A, then e(x_1, ..., x_n) is said to be instantiated in A as
e[x_1/v_1, ..., x_n/v_n] and can be interpreted in A. The assignment [x_1/v_1, ..., x_n/v_n] is
called an A-instance of x_1, ..., x_n, and each v_i is called an instance of x_i. (We will
abbreviate the assignment as [X/V], where V stands for { v_1, ..., v_n }.) An interpretation of
e[X/V] in A, written as e[X/V]|_A, is defined as follows:

(i) If e is a variable x_i, then e[X/V]|_A = v_i, and

(ii) if e is of the form 'σ(e_1, ..., e_n)',
     then e[X/V]|_A = f_σ(e_1[X/V]|_A, ..., e_n[X/V]|_A).

f_e(V) is e[X/V]|_A.

Interpreting a ground term or an instantiated term in A is thus like performing a
computation; an interpretation is the result of the computation.
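
The interpretation of terms can be sketched as a small recursive evaluator. The sketch below
is an illustration, not the thesis's construction: terms are encoded as nested Python tuples
(a hypothetical encoding), each operation returns the finite set of its possible results, and
the evaluator collects every interpretation of a term rather than one arbitrary result. Only
a handful of Set-Int and Int operations are included.

```python
from itertools import product

# Each operation name maps to a function returning the finite set of its
# possible results; deterministic operations return singleton sets.
OPS = {
    "Null":   lambda: {frozenset()},
    "0":      lambda: {0},
    "1":      lambda: {1},
    "Insert": lambda s, i: {s | {i}},
    "Size":   lambda s: {len(s)},
    "Choose": lambda s: set(s) if s else {0},
}


def interpretations(term, env=None):
    """Return the set of all interpretations of a term.

    A term is either a variable name (a str, looked up in env) or a tuple
    (operation_name, subterm_1, ..., subterm_n).
    """
    env = env or {}
    if isinstance(term, str):
        return {env[term]}
    op, *args = term
    arg_sets = [interpretations(a, env) for a in args]
    results = set()
    # Apply the operation to every combination of argument interpretations.
    for values in product(*arg_sets):
        results |= set(OPS[op](*values))
    return results
```

With e_1 encoded as ("Insert", ("Insert", ("Null",), ("0",)), ("1",)), the evaluator returns
a single interpretation for e_1 and two interpretations for ("Choose", e_1), reflecting the
choice made by the nondeterministic operation.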
2.2.4 Observable Behavior 

The behavior of a sequence of operations of a data type D, strictly speaking, 
becomes externally observable if the sequence has an effect on the outside world, for 
example, the sequence of operations ultimately results in some output on an I/O device, 
such as a line printer, CRT, etc. In this sense, the distinction between two values of D is 
observable if and only if there exists a sequence of operations such that when applied on 
the values separately, it returns distinguishable outputs on an I/O device. An output on an
I/O device can be considered as a sequence of characters, and we can have a predicate on 
the outputs, resulting in the boolean constants T and F depending upon whether the two 
given outputs are distinguishable or not. In this way, we can define the distinguishability of 
the values of D using the distinguishability of the boolean constants. We stop at Bool. As 
was stated earlier, we use the definition of Bool as the basis of our formalism. In fact, any 
data type (or a collection of data types) whose values can be distinguished a priori (outside 
the formalism) can be used as the basis. For instance, a data type directly supported in a 
programming language whose values are distinguishable using the literal mechanism in the 
programming language can be used.

We structure the above informal definition of distinguishability using the
dependency relation on data types. Instead of defining the distinguishability of the values
of D in terms of the distinguishability of boolean values in a single step, we do it
incrementally. We assume that the distinguishability relation is defined on the values of
every defining type D' ∈ Δ, if any; in this way, the behavior of the values of D can be
incrementally observed through its observers. Except for Bool, if D does not have any
observers, i.e., its Ω_O is the empty set, then the values of D are not distinguishable, as there
is no way to tell whether any two values are different. That is why we remarked earlier that
every interesting data type must have at least one observer.

For a D with a nonempty set of observers, it is generally not sufficient to examine
the values of D directly by the observers due to the possible delayed effects of the
constructors. The distinguishability of the values may not manifest itself until some
constructors are applied on them. For example, two different nonempty stacks of the data
type stack of integers may have the same integer as their top element, so they cannot be
distinguished directly by the observer Top. But if we apply the Pop operation first on the
two stacks, then the resulting stacks may be directly distinguishable by the observer Top,
thus exhibiting that the original stacks are also distinguishable. There is generally a need to
perform a sequence of operations with an observer of D as the last operation in the
sequence, to distinguish two values of D.

Informally, two values of D are distinguishable if and only if either 

(i) there is a sequence of deterministic operations of D such that when it is applied on 
the two values assuming every other argument of the sequence fixed, it results in 
distinguishable values of some defining type D' ∈ Δ, or

(ii) there is a sequence including nondeterministic operations such that the result of 
applying it on a value for some choice made by the nondeterministic operations is 
distinguishable from the result of applying it on the other value no matter what choice is 
made by the nondeterministic operations. 

If two values are not distinguishable, they are called observably equivalent. For better
exposition, we have deliberately structured the definition of distinguishability into two
cases, though the second case can be modified to cover the first case. The second case may
appear to be a very strong requirement, but a small amount of thinking should convince
the reader that such is not the case, as we definitely do not want a value to be
distinguishable from itself. Furthermore, observable equivalence should be an equivalence
relation and it must be preserved by the operations of the data type. We precisely state
below these requirements in the context of a type algebra and illustrate them using
examples.

The operations of a data type must also preserve the observable equivalence
relation on the values of every defining type in the sense that the operations cannot
distinguish among the observably equivalent values of a defining type. This requirement
on the operation behavior is necessary because of the modular structure of data types. A
new data type should not impose any additional structure on the values of any of its
defining data types. This property of a data type is guaranteed in all programming
languages supporting an abstract data type mechanism in which an implementation of a
data type is hierarchically structured and the representation is hidden from the users of a
data type.

We would like the type algebras to have the above properties. Definition 2.3 of a
type algebra does not guarantee them, so we put an additional constraint on a type algebra.
We first define the observable equivalence relation E_D on the principal domain V_D of a
type algebra A; we will assume that the observable equivalence relation E_D' on V_D' in A is
defined for each D' ∈ Δ by a model A' of D' having V_D' as its principal domain. We show
that E_D as defined below is an equivalence relation. Later we define a well formed type
algebra whose functions preserve the set E = { E_D' | D' ∈ Δ' } of observable equivalence
relations. Only the well formed type algebras are of interest for defining a data type.

In the above discussion, we have only considered the input-output behavior of the
operations for distinguishing different values. We have not considered the efficiency of the
operations. In case of nondeterministic operations, we have not considered how possible
values that a nondeterministic operation can return on a particular input are scheduled.
Our formalism is limited in this sense.

2.2.4.1 Definitions of Observable Equivalence and Distinguishability 

We give the basis and the inductive parts of the inductive definition of the
distinguishability relation. The basis part is the case when D does not have any defining
type and the inductive part is the case when D has defining types. In the basis part, there
are two subcases: (i) D is Bool, and (ii) D is different from Bool. We first define the data
type Bool and then define the distinguishability relation on the models of Bool.

The data type Bool does not have any defining types and is self-contained. We
present below a model of Bool and call it B.

B = [ { { true, false } } ; { T, F, ∨, ~, ∧, ⇒, ⇔ } ]

T               ≜ true

F               ≜ false

~ true          ≜ false

~ false         ≜ true

true ∨ true     ≜ true

true ∨ false    ≜ true

false ∨ true    ≜ true

false ∨ false   ≜ false

x ∧ y           ≜ ~((~x) ∨ (~y))

x ⇒ y           ≜ (~x) ∨ y

x ⇔ y           ≜ (~(x ∨ y)) ∨ (x ∧ y)

The interpretation of T is the logical value true and the interpretation of F is the logical 
value false. 

Def. 2.5 The data type Bool is the set of all type algebras isomorphic to B. ■

We will often use B as if it is the only model of Bool, and interchange between T and its 
interpretation true in B as well as between F and its interpretation false. We assume that 
the boolean constants T and F are distinguishable from each other a priori, meaning that 
their interpretation in every model of Bool is distinguishable. Each boolean constant is 
observably equivalent to itself. 

Def. 2.6.1 Let A be a model of Bool and V_Bool be the value set of Bool defined by A. The
observable equivalence relation on V_Bool is defined to be the identity relation on V_Bool.
The distinguishability relation on V_Bool is defined to be the complement of the observable
equivalence relation with respect to the universal relation on V_Bool (i.e., V_Bool × V_Bool). ■

The other component of the basis part of the definition of distinguishability is 
now given. 

Def. 2.6.2 For any data type D other than Bool not having any defining type, no value in
V_D of an algebra A of type D is distinguishable from any other value in V_D. ■

The inductive part is as follows: 






Def. 2.6.3 Two values v_1 and v_2 in V_D of a type algebra A are distinguishable iff there is a
term of type D' with exactly one variable of type D, expressed as c(x), such that the
instantiation c[x/v_1] interprets in A to a value of a type D' ∈ Δ (an element of V_D') that is
distinguishable from every possible value to which the instantiation c[x/v_2] interprets, or
vice versa. ■

The case 2.6.2 above can be derived from the case 2.6.3.

Def. 2.7 v_1 and v_2 are observably equivalent, i.e., <v_1, v_2> ∈ E_D, iff v_1 and v_2 are not
distinguishable. ■

It should also be obvious from the above definitions that if D does not have any observers
and D is different from Bool, then all members of V_D are observably equivalent. The
following definitions are useful in dealing with data types having nondeterministic
operations.

Def. 2.8 Given two subsets A_1 and A_2 of V_D, A_1 is observably equivalent to A_2 and vice
versa, iff (∀ v_1 ∈ A_1) (∃ v_2 ∈ A_2) [ <v_1, v_2> ∈ E_D ], and vice versa. ■

Def. 2.9 A_1 and A_2 are distinguishable iff A_1 and A_2 are not observably equivalent. ■

Then the case 2.6.3 can be rephrased as:

v_1 and v_2 are distinguishable iff there is a term c(x) such that { c[x/v_1]|_A } is
distinguishable from { c[x/v_2]|_A }.
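
The set-level definitions lend themselves to a direct sketch. The Python below is an
illustration under assumptions: result sets are finite Python sets, the observable
equivalence on the result type is passed in as a predicate, and the helper names are
hypothetical. It checks one fixed term only; the definition itself quantifies over all terms.

```python
def sets_equivalent(a1, a2, equiv):
    """Def. 2.8: each element of a1 has an equivalent in a2, and vice versa."""
    return (all(any(equiv(v1, v2) for v2 in a2) for v1 in a1)
            and all(any(equiv(v1, v2) for v1 in a1) for v2 in a2))


def term_distinguishes(results_of, v1, v2, equiv):
    """One fixed term c(x) distinguishes v1 and v2 iff the result sets of
    c[x/v1] and c[x/v2] are not observably equivalent (Def. 2.9)."""
    return not sets_equivalent(results_of(v1), results_of(v2), equiv)


def choose(s):
    # Result set of Choose in a set-based algebra; 0 on the empty set.
    return set(s) if s else {0}


def identity(x, y):
    # Assumed observable equivalence on Int: the identity relation.
    return x == y
```

Negating the two-sided condition of Def. 2.8 gives exactly the "some result of one differs
from every result of the other, or vice versa" reading of the case 2.6.3.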

Consider the type algebra A_si of Set-Int (see Subsection 2.2.2). It can be proved
using the definition of Int that the observable equivalence relation on Z, the value set of Int
used in A_si, is the identity relation. Then the sets {} and {0} are distinguishable since the
term Size(x) distinguishes them. The sets {0, 1} and {1, 2} are also distinguishable since
the term Choose(x) distinguishes them: An interpretation of Choose({0, 1}) is either 0 or 1,
and if 0 is chosen as an interpretation, there is no interpretation of Choose({1, 2}) returning
0. By similar reasoning, {0, 1} is also distinguishable from {0}. {0, 1} is observably
equivalent to itself. The observable equivalence relation on the principal domain of A_si is
the identity relation. However, it can be shown that the observable equivalence relation on
the principal domain of A_si^1 is not the identity relation, because, for example, <1, 2> is
observably equivalent to <2, 1>. In fact, any two sequences having the same set of integers
are observably equivalent in A_si^1.

    E_Set-Int^1 = { <s1, s2> | s1 is a permutation of s2 }.

Thm. 2.1 The observable equivalence relation E_D is an equivalence relation.

Proof That E_D is reflexive and symmetric is obvious from the definition. The transitivity
of E_D can be shown by induction on type algebras using the dependency relation. ■

The requirement that the functions in a well formed type algebra A preserve the
observable equivalence relation E_D' for each D' ∈ Δ' is equivalent to requiring that
E = { E_D' | D' ∈ Δ' } be a congruence on A, where a congruence on a heterogeneous
algebra is defined in Appendix II.

Def. 2.10 A type algebra A is well formed if and only if E is a congruence on A. ■

Since we are interested only in well formed type algebras, by a type algebra we henceforth
mean a well formed type algebra unless stated otherwise.

For example, both A_si and A_si^1 are well formed. E^1 = { E_Set-Int^1, E_Int, E_Bool }
in case of A_si^1, where E_Int and E_Bool are the identity relations, is a congruence on A_si^1.

Thm. 2.2 Assuming that E_Bool is the largest congruence on a model of Bool, E is the
largest congruence on A.

Proof See Appendix II. ■

The above theorem implies that the observable equivalence relations on the domains in A
completely extract its observable behavior in the sense that in the quotient algebra A/E
induced by E on A, every value is distinguishable from each other.






2.2.4.2 Reduced Algebras 

It is technically cumbersome to deal with a type algebra having distinct but 
observably equivalent values, so we introduce the notion of a reduced algebra. 

Def. 2.11 An algebra A of type D is called reduced if and only if for each D' ∈ Δ', E_D' is the
identity relation. ■

So all members in every domain of a reduced type algebra are distinguishable. For
example, A_si is reduced, whereas A_si^1 is not. B, the model of Bool, is also reduced.

Given an algebra A, we can get its reduced algebra by taking the quotient of A
w.r.t. E = { E_D' | D' ∈ Δ' }, since E is a congruence on A. The reduced algebra⁹
corresponding to A is

    A/E = [ { V_D'/E_D' | D' ∈ Δ' } ; { g_σ | σ ∈ Ω } ], where

    g_σ([v_1], ..., [v_n]) = [ f_σ(v_1, ..., v_n) ].

The principal domain of the reduced algebra corresponding to an algebra of D having no
observers, where D is not Bool, will have a single element. The reduced algebra
corresponding to A_si^1 has as its principal domain

    SQ^1/E_Set-Int^1 = { { <> }, { <0> }, { <1> }, { <-1> }, { <2> }, { <-2> },
                         { <0, 1>, <1, 0> }, { <0, -1>, <-1, 0> }, ... }.
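
The quotient construction for the sequence representation can be sketched on a finite slice
of the domain. This is an illustration with hypothetical helper names; it assumes the
observable equivalence on sequences is "same elements in any order", as stated earlier for
the sequence-based algebra, and it enumerates only sequences over a small universe.

```python
from itertools import permutations


def sequences(universe, max_len):
    """All nonrepeating sequences over universe up to length max_len
    (a finite slice of the sequence domain)."""
    return [p for k in range(max_len + 1) for p in permutations(universe, k)]


def quotient(domain, equiv):
    """Partition domain into the equivalence classes of equiv."""
    classes = []
    for v in domain:
        for cls in classes:
            if equiv(v, cls[0]):
                cls.append(v)
                break
        else:
            # No existing class matched, so v starts a new class.
            classes.append([v])
    return [frozenset(c) for c in classes]


def same_elements(s1, s2):
    # Assumed observable equivalence: s1 is a permutation of s2.
    return sorted(s1) == sorted(s2)
```

Each resulting class corresponds to exactly one finite set of integers, which is why the
reduced sequence algebra lines up with the set-based one.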

2.2.5 Behavioral Equivalence of Type Algebras 

As was stated at the beginning of this section, in order to abstract the observable
behavior of a type algebra, we must abstract from (i) multiple representations of the values
of a data type in the type algebra as well as from (ii) different representational structures
used for the values in different type algebras. The observable equivalence relation
discussed above does the first task. It identifies representations having the same observable
behavior. For the second task, we employ the standard algebraic concept of isomorphism.



9. It can be easily shown that A/E is also a type algebra. 






By combining the two, we define the behavioral equivalence relation on type algebras as 
follows: 

Def. 2.12 Type algebras A_1 and A_2 are behaviorally equivalent if and only if the reduced
algebra corresponding to A_1 is isomorphically equivalent to the reduced algebra
corresponding to A_2. ■

We later show that the above definition indeed captures the desired intuition that
two behaviorally equivalent algebras have the same observable behavior. By this, we mean
that an interpretation of a ground term e in one algebra behaves the same way as an
interpretation of e in the other algebra, when manipulated by the operations. (Informally
speaking, a computation results in equivalent values in two related type algebras.)

The isomorphic equivalence of two type algebras is stronger than the
isomorphism of the two type algebras if considered as they are. If D does not have any
defining type, then isomorphic equivalence is the same as isomorphism. However, if
two type algebras are considered in the expanded form in which they have a domain
corresponding to every data type D'' ∈ (D)* and a function corresponding to every
operation of D'', then isomorphic equivalence is the same as isomorphism. Since we do not
wish to carry all this information in a type algebra and consider a type algebra in the
expanded form, we assume that for each D' in Δ, the models of D' defining V^1_D' and V^2_D' as
the value sets of D' are isomorphically equivalent and there is a bijection Φ_D' from V^1_D' to
V^2_D' defined by the isomorphic equivalence relation. We thus do not use any arbitrary
bijection from V^1_D' in A_1 to V^2_D' in A_2 to show isomorphic equivalence between A_1 and A_2.
Instead, we build the bijections bottom up, establishing correspondence between the values
in the two algebras. The set { Φ_D' | D' ∈ Δ } induces a bijection Φ_D from V^1_D to V^2_D so that
Φ = { Φ_D' | D' ∈ Δ' } is an isomorphism from A_1 to A_2.






Def. 2.13 Given two type algebras A_1 and A_2 such that for each D' ∈ Δ, the models
defining V^1_D' and V^2_D' as the value sets of D' are isomorphically equivalent, which defines a
bijection Φ_D' : V^1_D' → V^2_D', A_1 and A_2 are isomorphically equivalent if and only if there is a
bijection Φ_D from V^1_D to V^2_D such that Φ = { Φ_D' | D' ∈ Δ' } is an isomorphism from A_1 to
A_2. ■

Note that both A_1 and A_2 above are either deterministic or the corresponding functions in
A_1 and A_2 have the same amount of nondeterminism.

For example, the models of Bool are isomorphically equivalent. The type
algebras A_si and A_si^1 of Set-Int are behaviorally equivalent because A_si and A_si^1/E^1 are
isomorphically equivalent. We can define three other type algebras of Set-Int which are
similar to A_si^1. The type algebras A_si^2, A_si^3, and A_si^4 have sets represented by finite
ordered sequences of nonrepeating integers, finite ordered sequences of repeating integers,
and finite (unordered) sequences of repeating integers respectively; the definitions of
various functions are appropriately given. It can be shown that the type algebras A_si,
A_si^1, A_si^2, A_si^3, and A_si^4 are behaviorally equivalent.
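
As a hedged illustration of why a set-based and a sequence-based algebra agree in behavior,
the sketch below checks, on a small finite universe only, that mapping each nonrepeating
sequence to the set of its elements commutes with Insert, Size, and Choose. All names are
hypothetical, Insert on sequences is assumed to append, and passing the check is evidence
on this finite slice rather than a proof of behavioral equivalence.

```python
from itertools import permutations

UNIVERSE = (0, 1, 2)  # hypothetical finite slice of Int


def ins_set(s, i):
    return s | {i}


def ins_seq(q, i):
    # Assumption: a new integer is appended at the end.
    return q if i in q else q + (i,)


def choose_set(s):
    return set(s) if s else {0}


def choose_seq(q):
    return set(q) if q else {0}


def phi(q):
    """Map a sequence to the set it represents (constant on permutation classes)."""
    return frozenset(q)


def check_commutes():
    seqs = [p for k in range(len(UNIVERSE) + 1) for p in permutations(UNIVERSE, k)]
    for q in seqs:
        # Size and Choose give the same results on q and on phi(q).
        if choose_seq(q) != choose_set(phi(q)) or len(q) != len(phi(q)):
            return False
        # Insert commutes with phi.
        for i in UNIVERSE:
            if phi(ins_seq(q, i)) != ins_set(phi(q), i):
                return False
    # phi reaches every subset of the universe.
    return len({phi(q) for q in seqs}) == 2 ** len(UNIVERSE)
```

Because phi sends exactly the permutations of one sequence to the same set, it factors
through the quotient by observable equivalence and yields a one-to-one correspondence
between the classes of sequences and the sets.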


Note that two behaviorally equivalent type algebras need not have the same
amount of nondeterminism. In fact, one could be deterministic whereas the other could be
nondeterministic, because the possible results returned by a nondeterministic function on
an input in such a nondeterministic algebra are observably equivalent.

From the definitions of isomorphic equivalence and behavioral equivalence, we
have the following:

Thm. 2.3 A_1 is isomorphically equivalent to A_2 ⇒ A_1 is behaviorally equivalent to A_2.

Proof Assume A_1 and A_2 are isomorphically equivalent. Let E_1 and E_2 be the sets of
observable equivalence relations on A_1 and A_2 respectively. Then, A_1/E_1 and A_2/E_2 can be
shown to be isomorphically equivalent. (By Theorem 2.2, E_1 is the largest congruence on
A_1 and E_2 is the largest congruence on A_2.) So, A_1 and A_2 are behaviorally equivalent. ■






Thm. 2.4 The behavioral equivalence relation on type algebras is an equivalence relation. 

Proof The reflexivity and symmetry properties are obvious from the definition. The
transitivity can be proved from the fact that the composition of two isomorphisms is also an
isomorphism. ■

The behavioral equivalence of type algebras A_1 and A_2 can be expressed as

                  Φ̂
        A_1 -------------> A_2
         |                  |
     H_1 |                  | H_2
         v                  v
      A_1/E_1 ---------> A_2/E_2
                  Φ

such that the above diagram commutes, i.e.,

      Φ . H_1 = H_2 . Φ̂        (†)

(The function f . g has the same behavior as applying g first and then applying f to the
result.) E_1 and E_2 are the congruences consisting of the observable equivalence relations on
A_1 and A_2 respectively; A_1/E_1 and A_2/E_2 are the reduced algebras corresponding to A_1 and
A_2 respectively; and Φ is the isomorphism defined by the isomorphic equivalence of A_1/E_1
and A_2/E_2. H_1 and H_2 are the homomorphisms induced by the congruences E_1 on A_1 and
E_2 on A_2 respectively. The equation (†) defines the set Φ̂ of many-to-many mappings
{ Φ̂_{D'} : V^1_{D'} → V^2_{D'} | D' ∈ Δ ∪ { D } } relating A_1 and A_2. In Appendix II, we discuss, for two
behaviorally equivalent type algebras A_1 and A_2, how a many-to-many mapping
Φ̂_D : V^1_D → V^2_D can be constructed from the set of many-to-many mappings { Φ̂_{D'} | D' ∈ Δ },
where for each D' ∈ Δ, Φ̂_{D'} is a many-to-many mapping from V^1_{D'} to V^2_{D'} defined by the
behaviorally equivalent models of D' that define V^1_{D'} and V^2_{D'} respectively. We
also show that the above definition of behavioral equivalence indeed captures the desired
property that the sets of interpretations of a ground term are 'equivalent' in behaviorally
equivalent type algebras.
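The commuting condition can be spot-checked on a toy instance. The sketch below rests on assumptions of ours, not of the formalism: A_1 is taken to be a model of Set-Int whose values are tuples with repetitions, A_2 a model whose values are sorted tuples without repetitions, both reduced algebras are represented by frozensets (so the isomorphism between them is the identity), and only Insert is checked. All function names are hypothetical.

```python
def h1(seq):
    """H_1: send a value of A_1 (tuple, repeats allowed) to its class in A_1/E_1."""
    return frozenset(seq)


def h2(seq):
    """H_2: send a value of A_2 (sorted tuple, no repeats) to its class in A_2/E_2."""
    return frozenset(seq)


def phi(cls):
    """The isomorphism between the reduced algebras (identity in this sketch)."""
    return cls


def related(x, y):
    """x in A_1 and y in A_2 are related when both squares of the diagram agree."""
    return phi(h1(x)) == h2(y)


def insert1(seq, i):
    """Insert in A_1: append, keeping duplicates."""
    return seq + (i,)


def insert2(seq, i):
    """Insert in A_2: keep the tuple sorted and free of duplicates."""
    return tuple(sorted(set(seq) | {i}))


x, y = (3, 2, 3), (2, 3)
```

Several values of A_1, such as (3, 2, 3) and (2, 3, 2), are related to the same value (2, 3) of A_2, and applying Insert on both sides preserves the relation.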






Thm. 2.5 For behaviorally equivalent algebras A_1 and A_2, for every ground term e of type
D'' ∈ (D)*, for every v ∈ { e|_{A_1} }, there is a v' ∈ { e|_{A_2} } such that < [v], [v'] > ∈ Φ_{D''}, and
vice versa.
Proof See Appendix II. ■

The following theorem expresses that the distinguishability and observable 
equivalence of ground terms are invariant over behaviorally equivalent type algebras. 

Thm. 2.6 For behaviorally equivalent A_1 and A_2, for any ground terms e_1 and e_2 of type
D', {[ e_1|_{A_1} ]} = {[ e_2|_{A_1} ]} ⇔ {[ e_1|_{A_2} ]} = {[ e_2|_{A_2} ]}.
Proof See Appendix II. ■

{[...]} stands for a set of equivalence classes. 

2.2.6 Definition of a Data Type

The behavioral equivalence relation on type algebras abstracts their observable
behavior as shown above and captures the meaning of a data type.

Def. 2.14 A data type D is an equivalence class of algebras of type D defined by the
behavioral equivalence relation. ■

Let M_D stand for the set of all behaviorally equivalent algebras of type D. Every
A in M_D is called a model of D, as we have captured the semantics of the operations of D.
The principal domain of a model A defines a value set of D. If a model in D is a reduced
algebra, then it is called a reduced model. Since isomorphically equivalent algebras have
the same amount of nondeterminism, all reduced models of D are either deterministic or all
are nondeterministic (see p. 47). If a reduced model in D is nondeterministic, then the
interpretation of an operation in every reduced model has, informally speaking, the same
amount of nondeterminism. When we wish to present a data type D, we will do so by
presenting an element of M_D as the representative of M_D. We call this model the
denotation of D. We often use a reduced model as the denotation of a data type.

We can order algebras in M_D using the onto homomorphism relation. Given two
algebras A_1 and A_2 ∈ M_D, A_1 ≤ A_2 if and only if A_1 is an onto homomorphic image of A_2,
when A_1 and A_2 are considered in their expanded form. The relation ≤ can be shown to be
a partial order. A reduced model A of D is the least model in M_D up to isomorphic
equivalence. It is also called final in M_D because there is an onto homomorphism from
every model A' of D in M_D to A, as depicted in the following diagram.




[Diagram: the onto homomorphism from A' to A, with Φ̂ = Φ . H']



Def. 2.15 Set-Int is the set of all algebras behaviorally equivalent to A_{si}. ■

So, A_{si}, A^1_{si}, A^2_{si}, A^3_{si}, and A^4_{si} are models of Set-Int. It can be verified that all
models of Bool are behaviorally equivalent type algebras of Bool. We will use B as the
denotation of Bool and A_{si} as the denotation of Set-Int.

It should be clear from the above definition that a data type D not having any
observers consists of all type algebras of D. This is so because the definition of behavioral
equivalence of type algebras depends only on the behavior of the observers.

We now compare our definition of a data type with those of Zilles [77] and the
ADJ group [23]. They require a data type to be a set of all isomorphic (isomorphically
equivalent, to be exact) type algebras, which abstracts only the representation details from
the algebras. (They assume that a data type has only deterministic operations.) In their
approach, a data type whose models are the reduced algebras is distinct from another data
type whose models have distinct observably equivalent values even though both data types
have the same observable behavior. For example, the data type consisting of models
isomorphically equivalent to A_{si} would be different from the data type consisting of
models isomorphically equivalent to A^1_{si}. From a programmer's point of view, both the
data types are the same and cannot be distinguished. We do not understand the motivation
for making the above distinction. Our definition of a data type is stronger than theirs, and
it does not make the above distinction. It not only abstracts from the representations of the
values in a type algebra, but it also considers representations to be distinguishable only if
they can be distinguished by the operations. It is based on the programming language view
of a data type.

2.2.7 Observable Equivalence and Distinguishability of Terms 

Since every value in the value set V D defined by a model A of D is an 
interpretation of some ground term of type D, the observable equivalence relation and 
distinguishability relation on V D induce the observable equivalence relation and 
distinguishability relation on the ground terms of type D as follows: 

Two ground terms e_1 and e_2 of type D are observably equivalent w.r.t. A if and only if the
possible interpretations of e_1 in A are observably equivalent to the possible interpretations of
e_2 in A. And, e_1 and e_2 are distinguishable w.r.t. A iff they are not observably equivalent
w.r.t. A.

For example, the ground terms Insert(Insert(Null, 2), 3) and Insert(Insert(Null, 1), 2) of
type Set-Int are distinguishable w.r.t. A_{si}, as their interpretations {2, 3} and {1, 2} in A_{si}
are distinguishable, whereas Insert(Insert(Null, 2), 3) and Insert(Insert(Null, 3), 2) are
observably equivalent w.r.t. A_{si}, because they have the same interpretation in A_{si}. The
observable equivalence and distinguishability relations on ground terms of D w.r.t. A have
the properties of the observable equivalence and distinguishability relations on V_D in A;
remarks and observations made in Subsection 2.2.4 hold for them also.
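The example can be replayed with a small sketch. It assumes, as an illustration only, a model of Set-Int whose values are Python frozensets and which is reduced (distinct sets can be told apart by Has), so that observable equivalence of ground terms reduces to equality of their interpretations. The term encoding and the function names are hypothetical.

```python
# Ground terms are nested tuples: ("Null",) or ("Insert", term, int).

def interpret(term):
    """Interpret a ground term of type Set-Int in a frozenset-based model."""
    if term[0] == "Null":
        return frozenset()
    if term[0] == "Insert":
        _, sub, i = term
        return interpret(sub) | {i}
    raise ValueError(f"unknown operation {term[0]!r}")


def observably_equivalent(e1, e2):
    """In a reduced model, equivalent terms have equal interpretations."""
    return interpret(e1) == interpret(e2)


null = ("Null",)
t23 = ("Insert", ("Insert", null, 2), 3)
t12 = ("Insert", ("Insert", null, 1), 2)
t32 = ("Insert", ("Insert", null, 3), 2)
```

With these definitions, `t23` and `t12` are distinguishable while `t23` and `t32` are observably equivalent, matching the discussion above.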

Using the fact that all models of D are behaviorally equivalent and Theorem 2.6, 
it can be shown that every model of D induces the same observable equivalence relation on 
the ground terms of D. So we can say that the above relations are independent of a model 
and are relations on ground terms of D. We can use a reduced model to derive the 
observable equivalence relation on the ground terms of D. 

Distinguishability and observable equivalence of the ground terms of D are useful
in understanding the behavior of D. These relations characterize the behavior of D in the
same way as these relations on the values of a type algebra characterize the behavior of the
type algebra. Distinguishability captures the informal notion of the ground terms being
unequal. The models of a data type also induce observable equivalence and
distinguishability relations on ground terms of type D' ∈ Δ involving the operations of D in
the same way as above. Understanding of the observable equivalence relation on the
ground terms is helpful in writing a specification of a data type, as discussed in the next
chapter. A specification of a data type can be viewed as a way to describe the observable
equivalence relations on ground terms.

We can also define the observable equivalence relation on terms (possibly 
involving variables) as follows: 

Given terms e_1 and e_2 of type D' ∈ Δ', let X be the set of variables in e_1 and e_2; e_1 and e_2
are observably equivalent if and only if for some A ∈ M_D, for every A-instance V of X, the
possible interpretations of e_1[X/V] in A are observably equivalent to the possible
interpretations of e_2[X/V] in A. And, e_1 and e_2 are distinguishable if and only if they are not
observably equivalent.






2.3 Exceptional Behavior of a Data Type 

So far we have assumed that every operation of a data type D returns a normal 
value of its range type for any input in its domain. This assumption is not realistic, as it 
glosses over an important component of the behavior of a data type. In this section, we 
discuss the exceptional behavior of a data type. We relax the constraint that every 
operation terminates normally: An operation can terminate either normally by returning a 
value or by signalling an exception. For example, we modify the behavior of the operation 
Choose on the empty set; henceforth, we assume that it signals an exception instead of 
returning the integer 0. We discuss the assumptions made in the formalism about the 
behavior of the exception handling mechanism of a host programming language supporting 
the abstract data type mechanism. We extend the formalism introduced in the previous 
section to model the exceptional behavior. 

2.3.1 Assumptions about Exception Handling Mechanism 

We consider the exception handling mechanism an integral component of a host 
programming language supporting the data type facility. The exception handling 
mechanism performs two functions: Signalling the exceptions and handling the exceptions 
[52]. Signalling is the way a program notifies its caller of an exceptional condition, and 
handling is the way the caller responds to such a notification. A module implementing a 
data type must provide an adequate interface with the rest of the programming language 
for exception handling. Such an interface can be designed by naming the exceptions 
signalled by the operations along with the specification of information carried as arguments 
to the exception handlers. We will not be concerned with the semantics of the exception
handling mechanism of a programming language in this thesis; we rather consider the
exception handling mechanism insofar as it interacts with the data type mechanism.

Liskov and Snyder [50] discuss two models of structured exception handling: the
resumption model and the termination model. In the resumption model, it is possible to
resume the operation invocation signalling an exception after the exception has been
handled. In the termination model, the operation invocation is assumed to be completed
once it signals an exception. Liskov and Snyder describe many advantages of the
termination model over the resumption model. In particular, the behavior of the handlers
for the exceptions signalled by an operation is separated from the behavior of the operation
in the termination model approach; this maintains the modular structure of the operations.
In the resumption model, on the other hand, the behavior of the handlers becomes a part of
the operation behavior. Though there is not sufficient experience to suggest which of
the two models is better suited for abstract data types, we have decided to adopt the
termination model approach because of its simplicity.

In a language supporting a call-by-name argument passing mechanism (or in fact,
any mechanism in which the argument evaluation takes place inside the procedure body), it
is possible to implement a data type whose operations can handle the exceptions signalled
by the evaluation of their arguments. Few recently designed programming languages
support such an argument passing mechanism for at least two reasons: (i) its semantics is
quite complex, and (ii) it is inefficient to implement. Most programming languages
support a call-by-value, call-by-object [52], or call-by-reference mechanism; with these
mechanisms, it is not possible to implement a data type having an operation that handles
exceptions signalled by the evaluation of its arguments. We assume in our work that an
operation does not handle any exception signalled by the evaluation of its arguments;
rather, such exceptions are handled in the module in which the operation is invoked, as
arguments are evaluated inside this module. Every operation is assumed to expect normal
values as arguments.

If an operation takes multiple arguments, many arguments may signal exceptions.
The order in which the exceptions are signalled and handled depends upon the evaluation
order of the arguments of a procedure invocation in the host programming language; we do
not address this issue in the thesis. We would like our formalism to be compatible with any
reasonable ordering scheme adopted in the host programming language.



10. However, our approach for defining a data type is general and flexible enough to model a data type
having operations that handle exceptions signalled by its arguments. We simply have to extend the formalism
proposed in this section. A data type with such behavior can also be specified by extending the specification
language to be proposed in the next chapter.






We adopt CLU's view of a data type that the handlers associated with the 
exceptions signalled by the operations of a data type are not a part of the data type. This 
view keeps the behavior of the handlers separate from the type behavior and maintains the 
modular structure of the type mechanism. A user of a data type has the flexibility of 
associating different handlers for an exception in different contexts. We will not discuss 
the behavior of the handlers in our research. 

Exceptions signalled by the operations are distinguished by naming them. An
exception can carry information as its arguments from the place where the exception is
signalled, and this information can be used by a handler associated with the signalled
exception. An operation can signal many exceptions to exhibit different properties of an
input.

For illustration, we consider the data type bounded stack of integers, of size ≤ 100,
denoted by Stk-Int-100. Stk-Int-100 is an instantiation of the parameterized stack example
in [31]; it has the following operations:

Null     a constant denoting the empty stack of integers.
Push     inserts a given integer i at the end of a given stack s. It signals the exception
         overflow(s, i) if the given stack is of size ≥ 100. A handler for overflow may examine
         the stack and remove the useless elements to make space for the new element, or it
         may do something else.
Pop      removes the last integer inserted into a given nonempty stack s. When invoked on the
         empty stack, it returns the empty stack back.
Top      returns the last integer inserted into a given nonempty stack s. It signals the exception
         no-top() if s is empty. No-top does not take any argument.
Replace  replaces the last integer inserted into a given nonempty stack s by a given integer i. It
         signals the exception can't-replace(i) on the empty stack.
Empty    tests whether a given stack is empty or not.

For Stk-Int-100, Δ = { Int, Bool } and Ω = { Null, Push, Pop, Top, Replace, Empty }.
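The informal description above can be sketched in a host language with termination-model exceptions. The following is a minimal illustration in Python, not a definitive implementation: the class and function names (`Overflow`, `NoTop`, `CantReplace`, `push`, and so on) are our own, stacks are assumed to be immutable tuples whose last element is the top, and handlers are deliberately left to the caller.

```python
LIMIT = 100  # bound of Stk-Int-100


class Overflow(Exception):
    """overflow(s, i): carries the full stack and the rejected integer."""
    def __init__(self, stack, item):
        super().__init__(stack, item)
        self.stack, self.item = stack, item


class NoTop(Exception):
    """no-top(): carries no arguments."""


class CantReplace(Exception):
    """can't-replace(i): carries the integer that could not be stored."""
    def __init__(self, item):
        super().__init__(item)
        self.item = item


def null():
    return ()


def push(s, i):
    if len(s) >= LIMIT:
        raise Overflow(s, i)
    return s + (i,)


def pop(s):
    return s[:-1]          # the empty stack is returned unchanged


def top(s):
    if not s:
        raise NoTop()
    return s[-1]


def replace(s, i):
    if not s:
        raise CantReplace(i)
    return s[:-1] + (i,)


def empty(s):
    return len(s) == 0
```

A caller may attach different handlers in different contexts, for instance `try: top(null())` followed by `except NoTop: ...`; the handlers are not part of the type.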






2.3.2 Formalism 

We discuss extensions of the formalism introduced in the previous section to
model the exceptional behavior of the operations. We discuss modifications to the
definitions and their implications. Some important definitions will be fully presented. The
discussion and results of Section 2.2 are applicable once these modifications are
incorporated.

We first extend the definition of a type algebra given in Subsection 2.2.1. We
want to keep the normal values of every data type separate from the exceptions, because
the exceptions have totally different behavior as compared to the normal values, and
because the exceptions should not be typed. In addition to a domain corresponding to
every D' ∈ Δ' containing the normal values of D', a modified type algebra has a new domain
of exceptions denoted as EXV. EXV consists of all the exceptions (exception values)
signalled by the operations of D'' ∈ (D)*, where for every exception name ex of arity
D_1 × ... × D_n, and each v_i of type D_i, ex(v_1, ..., v_n) is called an exception value. The
exception domain EXV in a type algebra A of type D inherits the exception domain of a
model A' of D' ∈ Δ whose principal domain V_{D'} is being used in A. The exception values
signalled by the functions interpreting the operations of D are explicitly specified. Let exv
stand for an exception value. When an operation σ signals, this is modeled as its
interpretation f_σ returning an element of EXV.

We now present the modified type algebra: 

Def. 2.16 An algebra A of type D is a heterogeneous algebra

      [ { V_{D'} | D' ∈ Δ' }, EXV; { f_σ | σ ∈ Ω } ]

such that

(i) for every defining type D' ∈ Δ, V_{D'} is a value set of D' defined by a model
of D'. V_{D'} consists only of the normal values returned by the constructors
of D',

(ii) EXV is the exception domain including the exception domain of a model
of D' defining V_{D'} for each D' ∈ Δ, and the exception values signalled by
the operations of D,

(iii) for every σ ∈ Ω, its interpretation f_σ is a total function of the appropriate
arity. If D' is the range of σ, f_σ either results in a normal value in V_{D'} or
returns an exception value. If any argument to f_σ is in EXV, f_σ is not
defined on these arguments (see footnote 11), and

(iv) V_D is the smallest set closed under finitely many applications of the
functions corresponding to the constructors of D (i.e., { f_σ | σ ∈ Ω_c }). V_D
only contains the normal values resulting from the constructors. ■

Recall that by assumption, even if f_σ is nondeterministic, it behaves deterministically on an
input on which it signals. We assume that for every D' ∈ Δ', it is possible to distinguish the
normal values from the exceptions; this assumption is implicit in every programming
language supporting exception handling.

2.3.2.1 Terms, Exception Terms, and Interpretations 

In addition to terms as defined in Subsection 2.2.3, we have exception terms 
defined as follows. 

Def. 2.17 For every exception name ex of arity D_1 × ... × D_n, ex(e_1, ..., e_n) is an
exception term if each e_i is a term of type D_i. ■

An exception term not having any variables is called a ground exception term.

An interpretation of a ground term e in a type algebra A is undefined if any of
its subterms interprets to an exception value. So, Proposition 2.1 in Subsection 2.2.3 gets
modified to



11. An equivalent interpretation is to have f_σ signal a distinguished exception value, say abort() for example.
We have not chosen this interpretation because it gives the impression of an exception value being passed as
an argument to the operation. If we wish to model a data type with an operation handling exceptions
signalled by the evaluation of its arguments, we can make its interpretation
return normal values even when its arguments signal exceptions, so that it could return a normal value in that
case.






Prop. 2.5 An interpretation of a ground term of type D'' ∈ (D)* in an algebra A of type D
is either a normal value, an exception value, or undefined. ■

If an interpretation of e is an exception value or is undefined, then e has a unique
interpretation in A. An interpretation of an instantiated term as well as of a term in A is
similarly defined. Proposition 2.2 in Subsection 2.2.3 also holds for the modified type
algebra.

An interpretation of a ground exception term ex(e_1, ..., e_n) in A is defined only if
each e_i|_A is a normal value of type D_i; then, ex(e_1, ..., e_n)|_A = ex(e_1|_A, ..., e_n|_A).
Otherwise, ex(e_1, ..., e_n)|_A is undefined. The definition of an interpretation of an
instantiated exception term and of an exception term in A can be given using the above
definitions.

2.3.2.2 Examples of Modified Type Algebras 

The type algebras A_{si} and A^1_{si} of Set-Int given in Subsection 2.2.2 are modified
to incorporate the exceptions. We will use the symbols A_{si} and A^1_{si} to stand for the
modified algebras as well.

      A_{si} = [ { S, Z, B }, EXV; { Nu, In, Re, Ha, Si, Ch } ]

The Choose operation signals the exception no-element, which is included in EXV; so
Ch applied to the empty set returns no-element() instead of 0. Otherwise, the definitions of
the functions remain the same. Similarly, for A^1_{si}, we have

      A^1_{si} = [ { SQ^1, Z, B }, EXV; { Nu^1, In^1, Re^1, Ha^1, Si^1, Ch^1 } ]

where Ch^1(<>) ≜ no-element(), and the definitions of other functions remain the same.

We present a type algebra A_{stk} of Stk-Int-100.

      A_{stk} = [ { SQ', Z, B }, EXV; { Nu, Pu, Po, To, Re, Em } ]

where Z and B are the value sets defined by the models of Int and Bool respectively. And,
SQ' is the set of all sequences of integers of length ≤ 100,

      SQ' = { <>, <0>, <1>, <-1>, <2>, <-2>, ..., <0, 0>, <0, 1>, <0, -1>, ... }

The interpretations of the operation names are defined as follows:

      Nu ≜ <>

      Pu(<i_1, ..., i_m>, i) ≜ overflow(<i_1, ..., i_m>, i)    if m ≥ 100
                               <i_1, ..., i_m, i>              otherwise

      Po(<i_1, ..., i_m>) ≜ <>                                 if m = 0
                            <i_1, ..., i_{m-1}>                otherwise

      To(<i_1, ..., i_m>) ≜ no-top()                           if m = 0
                            i_m                                otherwise

      Re(<i_1, ..., i_m>, i) ≜ can't-replace(i)                if m = 0
                               <i_1, ..., i_{m-1}, i>          otherwise

      Em(<i_1, ..., i_m>) ≜ T                                  if m = 0
                            F                                  otherwise.
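The three possible outcomes of interpreting a ground term (a normal value, an exception value, or undefined) can be made concrete with a short sketch of the stack algebra. This is an illustration under our own assumptions: exception values are modeled by a hypothetical `Exc` record, "undefined" by a sentinel `UNDEFINED`, sequences by Python tuples, and ground terms by nested tuples with integer constants written as Python ints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exc:
    """An exception value ex(v_1, ..., v_n) in the exception domain EXV."""
    name: str
    args: tuple = ()


UNDEFINED = object()   # marker: the term has no interpretation


def pu(s, i):
    return Exc("overflow", (s, i)) if len(s) >= 100 else s + (i,)


def po(s):
    return s[:-1]


def to(s):
    return Exc("no-top") if len(s) == 0 else s[-1]


def rep(s, i):
    return Exc("can't-replace", (i,)) if len(s) == 0 else s[:-1] + (i,)


def em(s):
    return len(s) == 0


FUNCS = {"Push": pu, "Pop": po, "Top": to, "Replace": rep, "Empty": em}


def interpret(term):
    """Return a normal value, an Exc, or UNDEFINED for a ground term.

    Terms are ints (integer constants), ("Null",), or (op, subterm, ...).
    """
    if isinstance(term, int):
        return term
    if term[0] == "Null":
        return ()
    args = [interpret(t) for t in term[1:]]
    # A term is undefined if any subterm yields an exception value or is undefined.
    if any(a is UNDEFINED or isinstance(a, Exc) for a in args):
        return UNDEFINED
    return FUNCS[term[0]](*args)
```

For instance, Top(Null) interprets to the exception value no-top(), whereas Push(Null, Top(Null)) is undefined because one of its subterms interprets to an exception value.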

Henceforth, by a type algebra, we mean a modified type algebra unless stated otherwise.

2.3.2.3 Observable Behavior and Distinguishability

The definition of Bool given in Subsection 2.2.4 remains the same, because no
boolean operation signals.

As was stated earlier, if the operations of a data type exhibit exceptional behavior,
its values can also be distinguished due to its exceptional behavior. If a sequence of
operations signals an exception on one value and does not signal on the other, then the two
values are distinguishable. If a sequence of operations signals on both values, the two
values are distinguishable if the sequence signals different exceptions. Thus the behavior
of the values of a data type can also be observed using the exception handling mechanism
of the host programming language. Even if a data type does not have any defining types,
its values can be distinguished if its operations signal exceptions.

We define the distinguishability relation on V_D and the distinguishability relation
on the exception domain EXV in A mutually recursively, using the distinguishability
relations on the domains corresponding to the defining types. It should be made sure that
arguments to exception names are such that the two definitions are well founded. The
definition of distinguishability on exception values incorporates that (i) two exceptions
having different names are distinguishable, and (ii) two exceptions having the same name
but distinguishable arguments are distinguishable.

Def. 2.18 Given two exception values ex_1(v_1, ..., v_m) and ex_2(v'_1, ..., v'_n) in EXV, they are
distinguishable iff (i) ex_1 ≠ ex_2, or (ii) if ex_1 = ex_2 and m = n, then for some 1 ≤ i ≤ m, v_i
is distinguishable from v'_i. Two exception values are observably equivalent iff they are not
distinguishable. ■
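The definition can be read as a simple recursive check. The sketch below is only an illustration: the `Exc` record and the function names are hypothetical, and the distinguishability relation on argument values is passed in as a parameter (`value_distinguishable`), since in the formalism it comes from the domains of the argument types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exc:
    """An exception value: an exception name applied to normal values."""
    name: str
    args: tuple = ()


def exc_distinguishable(x, y, value_distinguishable):
    """Two exception values are distinguishable iff their names differ, or
    the names agree and some pair of corresponding arguments is distinguishable."""
    if x.name != y.name:
        return True
    # Same name implies the same arity, so zip pairs up all the arguments.
    return any(value_distinguishable(a, b) for a, b in zip(x.args, y.args))


def exc_equivalent(x, y, value_distinguishable):
    """Observable equivalence is the negation of distinguishability."""
    return not exc_distinguishable(x, y, value_distinguishable)


def int_distinguishable(a, b):
    """Assumed relation for integer arguments: distinct integers are distinguishable."""
    return a != b
```

For example, can't-replace(1) and can't-replace(2) are distinguishable through their arguments, while no-top() is observably equivalent only to itself.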

We denote the observable equivalence relation on EXV by E_{EXV}.

Def. 2.19 For an algebra A of type D having no defining types and whose operations do
not signal, all values in V_D are observably equivalent. ■

Def. 2.20 Two normal values v_1 and v_2 in V_D of an algebra A of type D are distinguishable
iff there exists a term with one variable of type D, expressed as c(x), such that one of the
following conditions holds:

(i) the instantiated terms c[x/v_1] and c[x/v_2] interpret to distinguishable
exception values in A,

(ii) c[x/v_1] interprets to a normal value and c[x/v_2] interprets to an
exception value or vice versa, or

(iii) c[x/v_1]|_A and c[x/v_2]|_A are normal values and { c[x/v_1]|_A } is
distinguishable from { c[x/v_2]|_A }. ■

Note that in the above definition of distinguishability, we have not included the case in
which exactly one of c[x/v_1] and c[x/v_2] is not defined because the condition (ii) above
takes care of it.

Def. 2.21 Two normal values v_1 and v_2 are observably equivalent iff they are not
distinguishable. ■

Theorem 2.1 of Subsection 2.2.4 extends to the above definition of observable
equivalence relation. E_{EXV} is also an equivalence relation.

We extend the definitions of congruence, homomorphism, and isomorphism for
type algebras having exception domains. The mappings from the normal domains of a type
algebra A_1 to the corresponding normal domains of another type algebra A_2 induce a
mapping Φ_{EXV} from the exception domain EXV_1 in A_1 to the exception domain EXV_2 in
A_2. The exception names act like operations; they preserve these mappings. Given

      A_1 = [ { V^1_{D'} | D' ∈ Δ' }, EXV_1; { f^1_σ | σ ∈ Ω } ]
      A_2 = [ { V^2_{D'} | D' ∈ Δ' }, EXV_2; { f^2_σ | σ ∈ Ω } ],

for every exception name ex of arity D_1 × ... × D_n,

      < ex(v_1, ..., v_n), ex(Φ_{D_1}(v_1), ..., Φ_{D_n}(v_n)) > ∈ Φ_{EXV}.

Theorem 2.2, modified to say that E = { E_{D'} | D' ∈ Δ' } ∪ { E_{EXV} } is the largest
congruence in A, holds; the proof is similar to the proof of Theorem 2.2. E captures the
normal as well as the exceptional behavior of the functions of a type algebra A.

We define a reduced algebra in the same way as in Subsection 2.2.4 using the
congruence E. The definition of behavioral equivalence relation on type algebras is the
same as in Subsection 2.2.5. The definition of isomorphic equivalence used in the
definition of behavioral equivalence is extended by including the mapping Φ_{EXV} in the
family Φ and requiring Φ_{EXV} also to be a bijection. The theorems of Subsection 2.2.5
exhibiting that the definition of behavioral equivalence of unmodified type algebras indeed
captures the desired intuition extend to the modified type algebras. The results and proofs
are modified to incorporate the fact that a ground term e (respectively, an instantiated term
e[X/V]) may interpret to a normal value, an exception value, or be undefined (see
Appendix II).

A data type D is defined in the same way as in Subsection 2.2.6 as a set of
behaviorally equivalent type algebras. Let M_D stand for this set. Every model in M_D now
has the exception domain EXV. The observable equivalence and distinguishability
relations on the ground terms of type D are defined as in Subsection 2.2.7. We incorporate
the facts that two ground terms whose interpretations in every model in M_D are undefined
are observably equivalent, and that if one of the ground terms has an undefined
interpretation whereas the other does not, then the two ground terms are distinguishable.






2.3.2.4 Comparison with Goguen's Approach 

Our approach is similar to Goguen's approach [20, 21] of modeling the
exceptional behavior of a data type in the sense that exceptions are named and can have
arguments. However, there are crucial differences in the two design philosophies. In
Goguen's approach, the definition of a new data type can possibly extend the definitions of
its defining types. This is so because the exceptions (called not-ok values in [20]) are typed
just like the normal values (called ok values in [20]). Instead of having a single domain of
exceptions, Goguen partitions a value set of D into the exception values and the normal
values; the exception value part of the value set expands as new types using D are defined.
For example, the definition of Stk-Int-100 would extend the definition of Int by defining a
new integer no-top (which is a not-ok value). We consider this as violating the modular
structure of the definitions.

The OBJ language of Goguen and Tardo [21] allows the handlers for the
exceptions signalled by the operations to be specified as a part of the type specification,
thus making the type behavior complex. We suspect that they adopt this approach because
of their attempt to develop the algebraic semantics of a complete programming language
including the control structures. So, they do not separate the semantics of the exception
handling mechanism from the data type.

In contrast, we have concentrated on the behavior of data types only. We have
separated the exception handling mechanism from the data type mechanism. We have
only considered components of the exception handling mechanism related to the type
definition mechanism. We do not consider the behavior of exception handlers as a part of
a data type for reasons discussed earlier. We believe that the type mechanism should only
provide an adequate interface to the exception handling mechanism of the host
programming language. We separate the exception domain from the domain of normal
values as exceptions have different behavior from the normal values. We do not type
exceptions either because doing so seems meaningless. In this way, we have been able to
define the behavior of the operations of a data type completely and uniformly, without
extending the definition of any of its defining types, thus preserving the modular structure
of the type mechanism.






2.3.3 A Simpler Approach 

In this subsection, we discuss another approach for modeling the exception
behavior of a data type, which is simpler than the approach discussed earlier. This
approach has been generally assumed in the literature on algebraic specification of data
types when the authors do not wish to discuss the exception behavior of the operations
[29, 77]. The ADJ group's work [23] is an attempt to formalize it, and Guttag [31] embeds it
in a rich way in a specification language. We discuss this approach for three reasons: (i) our
discussion is simpler and more natural than that of [23], (ii) our discussion would place the
works of those who have implicitly or explicitly assumed this approach of modeling
exceptional behavior on a firm basis, and (iii) our discussion provides a semantic basis for
Guttag's specification language.

In this approach, exceptions signalled by operations having the same range are
not distinguished and no information is passed with an exception to its handler. An
operation on an input either returns a normal value or signals an exception failure. For
example, the operations Push, Pop, and Replace signal the same exception failure. Every
operation is assumed to expect normal values as arguments. If an argument to an operation
signals failure, then the operation propagates it by signalling it.

Such exceptional behavior of the operations can be modeled by extending the
domain of every D' ∈ Δ' in an algebra A of type D (as defined in Subsection 2.2.1) with a
special exception failure; we denote it by err_{D'}. Whenever an operation σ signals failure,
its interpretation f_σ in A returns err_{D'}, where D' is the range type of σ. So we have

      A = [ { V_D ∪ { err_D } } ∪ { V_{D'} ∪ { err_{D'} } | D' ∈ Δ }; { f_σ | σ ∈ Ω } ]

If any of the x_i's is err_{D_i}, then f_σ(x_1, ..., x_n) = err_{D'}, i.e., f_σ is strict with respect to its
arguments. We assume that for every D' ∈ Δ', it is possible to distinguish between the
normal values and the exception value err_{D'}.
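Strictness with respect to the failure value can be sketched as a wrapper around each function. The following is a minimal illustration under our own assumptions: the `Err` record stands for the single failure value of a type, the `strict` decorator and the function names are hypothetical, and only Push and Top of the bounded stack are shown.

```python
import functools
from dataclasses import dataclass


@dataclass(frozen=True)
class Err:
    """The single failure value adjoined to the domain of the named type."""
    type_name: str


def strict(range_type):
    """Make a function return the failure value of its range type whenever
    any of its arguments is already a failure value."""
    def wrap(f):
        @functools.wraps(f)
        def g(*args):
            if any(isinstance(a, Err) for a in args):
                return Err(range_type)
            return f(*args)
        return g
    return wrap


@strict("Stk")
def push(s, i):
    # Failure on a full stack; no information travels with the failure.
    return Err("Stk") if len(s) >= 100 else s + (i,)


@strict("Int")
def top(s):
    return Err("Int") if len(s) == 0 else s[-1]
```

Here `top(())` yields the failure value of Int, and `push((), top(()))` propagates it as the failure value of Stk, which is the strict behavior described above.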

We modify the definition of Bool given in Section 2.2. The model B of Bool is
extended to have the exceptional value err_b.

      B' = ( { true, false, err_b }; { T, F, ∨, ~, ∧, ⇒, ⇔ } ),

where the definitions of the boolean operations remain the same on normal values.
Besides, every function is strict. Bool is defined as the set of type algebras isomorphic to
B'.

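We discuss a type algebra A'_{stk} of Stk-Int-100.

      A'_{stk} = [ { SQ' ∪ { err_{stk} }, Z', B' }; { Nu', Pu', Po', To', Re', Em' } ],
      where B' = B ∪ { err_b },
            Z' = Z ∪ { err_i }.

      Nu' ≜ <>

      Pu'(<i_1, ..., i_m>, i) ≜ err_{stk}                      if m ≥ 100
                                <i_1, ..., i_m, i>             otherwise

      Po'(<i_1, ..., i_m>) ≜ err_{stk}                         if m = 0
                             <i_1, ..., i_{m-1}>               otherwise

      To'(<i_1, ..., i_m>) ≜ err_i                             if m = 0
                             i_m                               otherwise

      Re'(<i_1, ..., i_m>, i) ≜ err_{stk}                      if m = 0
                                <i_1, ..., i_{m-1}, i>         otherwise

      Em'(<i_1, ..., i_m>) ≜ T                                 if m = 0
                             F                                 otherwise.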

The theory discussed in Section 2.2 directly extends to the above algebras also.
The definition of the interpretation of a term in Subsection 2.2.3 easily extends. A ground
term of type D' or an instantiated term may interpret to err_{D'}. The definition of
distinguishability of values of D in a type algebra also extends in a straightforward manner.
We want to add to the definition that (i) every normal value of D is distinguishable from
the exceptional value err_D, and (ii) two normal values v_1 and v_2 in V_D of A are also
distinguishable if there is a term c(x) such that c[x/v_1] interprets to an exceptional value,
whereas c[x/v_2] interprets to a normal value, or vice versa.

The behavioral equivalence relation on modified type algebras is a simple
extension of the definition given in Subsection 2.2.5. The modified definition of
isomorphic equivalence requires that every mapping maps the exception value
err_D' in A_1 to err_D' in A_2. Other conditions remain the same in the definition. A data type
D is a set consisting of all behaviorally equivalent type algebras of the above kind. The
observable equivalence and distinguishability relations on ground terms are defined in the
same way as in Subsection 2.2.7.






2.4 Mutually Recursive Data Types

We have assumed so far that data types can be designed hierarchically one at a
time and that the data types on which a data type D depends can be designed
independently of D. These assumptions are not valid for a subclass of data types. In some
cases, it may be more meaningful to associate an operation with a collection of data types,
instead of a single data type; for example, the conversion operations between the data types
fixed point number and floating point number. Or a group of data types may be mutually
dependent such that they cannot be defined one at a time; for example, the data types picture,
contents, component, and view in [32] are mutually recursive. In the latter case, the
dependency relation on data types as defined in Section 2.1 will have cycles.

For the above cases, we consider groups of mutually recursive data types together 
as one entity, and define direct dependency and dependency relation on such groups and 
nonrecursive data types in an analogous manner so that the relations do not have any 
cycles. A group of mutually recursive data types can be then defined hierarchically when 
considered as one entity. 

Let D stand for a group of new types being defined together. Let Δ stand for the
set of their defining types, assumed to be defined elsewhere, and Ω stand for the set of their
operation names.

A type algebra for a group of new data types D is a straightforward extension of a
type algebra for a single data type D. It has a domain corresponding to every D ∈ D in
addition to the domains corresponding to every defining type D' ∈ Δ and the exception
domain EXV. It also has a total function (deterministic or nondeterministic) corresponding
to every operation name in Ω. Instead of having a single principal domain as in the case of a
type algebra for a single data type, we have many distinguished domains in a type algebra
for D: every domain corresponding to D ∈ D is a distinguished domain. In order for the
distinguished domains to be nonempty, it is necessary that at least one of the data types in
D has a basic constructor (a constructor that does not take any argument of a type in D).
Furthermore, all the distinguished domains must be constructible by mutual recursion.

The theory developed for a single data type easily extends to a group of mutually
recursive data types. We can directly extend the definition of the interpretation of a term
in a type algebra defined above. The observable equivalence and distinguishability
relations can be similarly defined on V_D for each D ∈ D. They induce the observable
equivalence and distinguishability relations on the ground terms of type D. The behavioral
equivalence relation on type algebras can also be defined analogously.

A group of mutually recursive data types D is a set of all behaviorally equivalent
type algebras of the above kind. Every type algebra in the equivalence class is a model of
D. A model of D defines a value set of each D ∈ D, which is the distinguished domain
corresponding to D in the model.
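
As an illustration of a group with two distinguished domains, the following Python sketch defines trees and forests together. The group, its constructor names, and the size functions are hypothetical and are not taken from [32]; EmptyForest plays the role of the basic constructor that makes both domains nonempty.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical group of two mutually recursive types: Tree (Node) and Forest.
@dataclass(frozen=True)
class EmptyForest:
    """Basic constructor: takes no argument of a type in the group."""


@dataclass(frozen=True)
class Cons:
    """Forest constructor taking a tree and a forest."""
    head: "Node"
    tail: "Forest"


@dataclass(frozen=True)
class Node:
    """Tree constructor taking a label and a forest of children."""
    label: int
    children: "Forest"


Forest = Union[EmptyForest, Cons]


def tree_size(t: Node) -> int:
    # An operation of the group: defined by mutual recursion with forest_size.
    return 1 + forest_size(t.children)


def forest_size(f: Forest) -> int:
    if isinstance(f, EmptyForest):
        return 0
    return tree_size(f.head) + forest_size(f.tail)
```

Neither type can be defined before the other, yet the pair taken as one entity depends only on the integers, so the dependency relation on groups stays acyclic.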






3. Specification of an Abstract Data Type

In this chapter, we discuss a method for specifying abstract data types. Like the
definition method, the specification method is hierarchical and modular. We describe a
specification language in which data types having nondeterministic operations and having
operations exhibiting exceptional behavior can be specified. The main goal in designing
the language has been to develop a good notation for expressing the design of the data
component of programs. The specification language should be as flexible as possible to
enable a designer to conveniently express his/her intent. We do not restrict a specification
to specify a single data type only; instead, a specification in general specifies a set of related
data types sharing a common behavior. A specification only expresses properties particular
to the data type(s) being specified. Properties common to all data types, for instance, the
minimality property, are not specified. They are instead assumed in the semantics of the
specification language.

Since a data type is a set of models, its specification(s) must capture the properties
common to these models. The specification must specify the syntactic structure as well as
the observable behavior of these models. There can be many ways to do this. One way is
to present a model that acts as a representative of the above set. For instance, the definition
of a denotation of a data type D can serve as its specification; as an example, the model A_si
of Set-Int can serve as a specification of Set-Int. A data type is specified in this way in the
model approach [3], which is briefly discussed in Section 1.2. This method has a
disadvantage that since a particular representation of the values of the data type is used to
specify the data type, there is a danger of the irrelevant properties of the model being
associated with the data type. This shortcoming of the model approach can be
circumvented by choosing an appropriate semantics of the specification method as in [3].

Another way is to specify the properties that characterize the observable behavior
of all models of a data type. We adopt this approach, which is called the axiomatic
approach in Section 1.2. We specify the observable behavior as a finite set of properties of
the operations of D. These properties are expressed abstractly without referring to any
particular model of D and without assuming any particular representation of the values of
D. They are presented as first order formulas relating sequences of operations that return
observably equivalent values. The reasons for choosing the axiomatic approach are:

(i) A theory of a data type can be directly developed from its axiomatic specification 
without referring to any other domain of discourse, 

(ii) our work can be integrated with the work on the development of axiomatic systems
for reasoning about control structures [17, 36] and the automation of the verification
process, and

(iii) the methodology for proving the correctness of an implementation of the data type
with respect to its specification is simple and natural for a wide class of specifications.

Instead of allowing arbitrary first order formulas, we restrict the axioms to be
equations because

(i) an equational specification is amenable to deducing the properties of a data type (see
the next chapter, where the proof theory of a data type is developed from its specification;
also see Musser [60] for a discussion of a theorem prover for equational specifications),

(ii) an equational specification is easier for a programmer to understand (see [29] for a 
discussion on viewing equational axioms as recursive programs), 

(iii) certain desirable properties of specifications can be guaranteed by putting constraints 
on equations [28], 

(iv) an equational specification has been found to be more suitable for semi-automatically
deriving an implementation of a data type [64, ^J; and 

(v) a model can be more easily constructed from an equational specification than from a
specification whose axioms use existential quantifiers [16].

Our specification language allows a specification to introduce a finite set of
auxiliary functions to express the properties of the operations. An auxiliary function is not
an operation of a data type; rather it is a helping function in a specification. So it is a part
of a specification of a data type, and not a part of the data type itself.¹ The use of auxiliary
functions in a specification is a necessity, because if axioms are restricted to be equations
without auxiliary functions, many data types cannot be specified [2, 53, 71, 43]. With the
help of a finite set of auxiliary functions, one can specify using a finite set of equations,

(i) any data type with a recursively enumerable (r.e.) value set and a finite set of total
deterministic computable functions [28, 43], and

(ii) any data type that can be specified using a recursively enumerable set of equations,
restricted conditional equations, or positive conditional equations [43].

In this sense, our specification language is quite expressive. (For a detailed discussion of
the expressive power of an equational language with auxiliary functions and how it
compares with other algebraic languages for specifying data types, see [43].) Besides, we
have found auxiliary functions convenient and useful in expressing the properties of
complex operations; the judicious choice of auxiliary functions often results in
specifications that are relatively easier to write and understand as compared with equivalent
specifications written without using the auxiliary functions.

We discuss the specification language in the first section. Different components
of a specification are described. The semantics of a specification is given in the second
section. It is defined to be a set of related data types sharing the common behavior
captured by the specification. In the third section, we state what it means for a data type to
be (precisely) specifiable by a specification; equivalence among specifications is defined.
The fourth section discusses the specification of the data type boolean. In the fifth section,
we discuss two structural properties of a specification, consistency and behavioral
completeness, expressed in terms of relationships among the set of data types specified by
the specification. The consistency property requires that a specification specifies at least
one data type. The behavioral completeness property requires that a specification
completely specifies the observable behavior of the operations on intended inputs; it rules
out only unintentional incompleteness in a specification. In the sixth section, we compare our
specification language with the works of Zilles [77], Guttag et al. [29, 31], the ADJ group
[23], Goguen [20], Burstall and Goguen [7], Goguen and Tardo [21], and Nakajima et al.



1. An auxiliary function should not be confused with an internal procedure needed in an implementation of
a data type to implement its operations. (Chapter 5 discusses internal procedures.) An auxiliary function
however serves the same purpose in a specification as an internal procedure in an implementation. It is not
available to the users of a data type, and is used only for expressing and proving properties of the data type
from its specification.

2. We conjectured in [43] that even if axioms are allowed to be conditional equations (restricted, positive, or
unrestricted), there are many interesting data types that cannot be specified without auxiliary functions.

3. Guttag [31] rightly compares the use of auxiliary functions in a specification with the use of subroutine
(procedure) abstraction while writing a complex piece of software.






3.1 Specification Language 

The specification language has a single syntactic unit, called a specification module
(or simply a specification), which in general specifies a set of related data types. We first
discuss specifications of hierarchically structured (nonrecursive) data types; at the end of
the section we discuss a specification of mutually recursive data types.

We will use a single name to stand for any of the data types specified by a
specification. We may use the same name as the name of its specification whenever it is
possible to disambiguate from the context whether a name refers to a data type or its
specification. When we consider more than one specification of a data type, we use
different names for different specifications. Though a long name for a concept may convey
information about the behavior of the concept, the long name can be inconvenient to use,
so we allow abbreviations for long names to be introduced in a specification preceded by
the symbol as. Let D stand for a type being specified by a specification.
A specification in general has four components:

(i) Operations,

(ii) Auxiliary Functions,

(iii) Restrictions, and

(iv) Axioms.

The operations component specifies the syntactic properties of D, and the restrictions
component and the axioms component specify its semantic properties. We illustrate
different components of a specification using the specifications given in Figures 3.1 and 3.2.
Figure 3.1 is a specification of Set-Int. Figure 3.2 is a specification of a set Stk-Int of data
types; the data type Stk-Int-100 defined in Chapter 2 is in this set.

A specification is hierarchically structured; it refers to the specifications of data
types other than D assuming that these specifications are given elsewhere. Data types other
than D may have already been specified, or they will be specified later. For example, the
specification of Set-Int in Figure 3.1 refers to a specification of a data type Int. We assume
that Int is specified elsewhere. Since a specification of Int can specify a set of data types,
Int in Figure 3.1 stands for any data type in the set.






Figure 3.1. Specification of Set-Int

Operations

    Null    :                 → Set-Int      as ∅
    Insert  : Set-Int × Int   → Set-Int
    Remove  : Set-Int × Int   → Set-Int
    Has     : Set-Int × Int   → Bool         as x_2 ∈ x_1
    Size    : Set-Int         → Int          as #(x_1)
    Choose  : Set-Int         → Int          nondeterministic
                              → no-element()

Restrictions

    #(s) = 0 ⇒ Choose(s) signals no-element

Axioms

    Remove(∅, i) ≡ ∅
    Remove(Insert(s, i1), i2) ≡ if i1 = i2 then Remove(s, i1) else Insert(Remove(s, i2), i1)
    i ∈ ∅ ≡ F
    i1 ∈ Insert(s, i2) ≡ if i1 = i2 then T else i1 ∈ s
    #(∅) ≡ 0
    #(Insert(s, i)) ≡ if i ∈ s then #(s) else #(s) + 1
    Choose(s) ∈ s ≡ T
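
One way to read the specification of Set-Int is to check its axioms against a concrete model. The sketch below is an assumption-laden illustration, not a model given in the text: Set-Int values are Python frozensets, the exception no-element is the hypothetical class NoElement, and equality of frozensets stands in for observable equivalence.

```python
import random


# Hypothetical model: Set-Int values are frozensets of Python ints.
class NoElement(Exception):
    """Stands for the exception no-element."""


def null():
    return frozenset()


def insert(s, i):
    return s | {i}


def remove(s, i):
    return s - {i}


def has(s, i):
    return i in s


def size(s):
    return len(s)


def choose(s, rng=random):
    if size(s) == 0:
        raise NoElement()
    return rng.choice(sorted(s))   # any element is an acceptable result


def check_axioms(s, i1, i2):
    """Check the axioms of the Set-Int specification on one instantiation."""
    assert remove(null(), i1) == null()
    lhs = remove(insert(s, i1), i2)
    rhs = remove(s, i1) if i1 == i2 else insert(remove(s, i2), i1)
    assert lhs == rhs
    assert has(null(), i1) is False
    assert has(insert(s, i2), i1) == (True if i1 == i2 else has(s, i1))
    assert size(null()) == 0
    assert size(insert(s, i1)) == (size(s) if has(s, i1) else size(s) + 1)
    if size(s) > 0:                # restriction: Choose signals on the empty set
        assert has(s, choose(s)) is True
    return True
```

The last axiom constrains the result of Choose without fixing it, which is why the check only asserts membership.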



Whenever we introduce a new construct of a specification in this section, we
informally discuss its meaning for motivation and clarity of exposition. As was stated
above, the precise semantics of a specification will be given in the next section.



3.1.1 Operations 

This component specifies (i) the domain and range, and (ii) the names of the
exceptions signalled by every operation of D on its intended inputs, along with the types of
the arguments to the exceptions. It is a sequence of specifications of the following form:

    σ : D_1 × ... × D_n → D'
                        → ex_1(D_11, ..., D_1m_1)
                          ...
                        → ex_k(D_k1, ..., D_km_k),

where D_1 × ... × D_n is the domain of σ and D' is its range; σ signals exceptions having
names ex_1, ..., ex_k, whose argument types are also specified. If an operation is specified
to signal an exception, the exception must be listed in its syntactic specification. If σ does
not take any argument, then it is a constant of its range type. If an exception name ex does
not take any argument, it is expressed as ex() or simply ex. The operations component of a
specification of D indirectly specifies the Δ and Ω of D.

When an abbreviation is introduced for an n-ary operation name, we can specify
how the abbreviation distributes over the arguments using the argument place holders
x_1, ..., x_n. For example, the operation Has of Set-Int is abbreviated to '∈' and it is used as
'x_2 ∈ x_1'. We discuss later (Subsection 3.1.5) how nondeterministic operations are specified.

3.1.2 Auxiliary Functions 

This component is optional; it exists if auxiliary functions are used in writing the
Axioms and the Restrictions. As was discussed before, auxiliary functions are introduced to
enhance the expressive power of the specification language and to make the language more
flexible so that specifications are easier to write and understand. We do not recommend
choosing auxiliary functions randomly to express the behavior of the operations. Instead,
they should be chosen with care. An auxiliary function should embody a subsidiary
procedural abstraction needed to express the operation behavior. It is a good design
practice to completely specify an auxiliary function even if its behavior is needed only for a
subset of its input domain. Furthermore, if an auxiliary function is of the result type D, it
should not have to construct values that cannot be constructed by the constructors of D.
Every auxiliary function is deterministic, and there are no restrictions associated with it.
For example, the specification of Stk-Int in Figure 3.2 uses the auxiliary function Size.

We specify the domain and range of every auxiliary function used in the
specification in the same way as the operations. Let A_f stand for the set of all auxiliary
functions used in a specification. An auxiliary function may use a data type not in Δ'
(= Δ ∪ {D}) as a component of its domain or as its range; we call such a data type an
auxiliary type. Like a defining type, every auxiliary type is assumed to be specified
elsewhere. Let A_t stand for the set of auxiliary types used by the auxiliary functions in A_f.
If a specification does not have the auxiliary functions component, then A_f = ∅ and
A_t = ∅.

We extend the definition of a term in Subsection 2.2.3 to include terms 
constructed using the auxiliary functions and the operation symbols of the auxiliary types. 

Def. 3.1 An auxiliary term of type D' ∈ ∪ (D'')*, where the union ranges over
D'' ∈ {D} ∪ A_t, is defined inductively as

(i) a term of type D',

(ii) if σ ∈ A_f such that its domain is D_1 × ... × D_n and its range is D', then σ(e_1, ..., e_n)
is an auxiliary term of type D' if and only if each e_i is an auxiliary term of type D_i.

Clearly, if A_f and A_t are the empty sets, the definitions of an auxiliary term and a term
coincide. An auxiliary exception term can be defined by replacing terms by auxiliary terms
in the definition of an exception term in Subsection 2.3.2. Henceforth, by a term, we mean
an auxiliary term, and by an exception term, we mean an auxiliary exception term, unless
stated otherwise.



4. These constraints on auxiliary functions are imposed for convenience and simplicity. Our formalism
would work equally well if these constraints were not imposed.






Figure 3.2. Specification of Stk-Int

Stk-Int as Stk

Operations

    Null    :             → Stk
    Push    : Stk × Int   → Stk
                          → overflow(Stk, Int)
    Pop     : Stk         → Stk
    Top     : Stk         → Int
                          → no-top()
    Replace : Stk × Int   → Stk
    Empty   : Stk         → Bool

Auxiliary Functions

    Size    : Stk         → Int          as #(x_1)

Restrictions

    Pre(Pop(s)) :: ~Empty(s)
    Pre(Replace(s, i)) :: ~Empty(s)
    Empty(s) ⇒ Top(s) signals no-top()
    Push(s, i) signals overflow(s, i) ⇒ #(s) ≥ 100

Axioms

    Pop(Push(s, i)) ≡ s
    Top(Push(s, i)) ≡ i
    Replace(s, i) ≡ Push(Pop(s), i)
    Empty(Null) ≡ T
    Empty(Push(s, i)) ≡ F
    #(Null) ≡ 0
    #(Push(s, i)) ≡ #(s) + 1






3.1.3 Restrictions 

The restrictions and axioms components of a specification specify the normal as
well as the exceptional behavior of the operations. They also define the auxiliary functions,
if any, used in the specification. The axioms component specifies the normal behavior of
the operations. The exceptional behavior is specified as a separate layer over the normal
behavior. This is achieved by specifying restrictions on the operations in the restrictions
component. An axiom in the axioms component holds only if the operations used in the
axiom satisfy the specified restrictions. The restrictions component is an extension of the
Restrictions Specifications of Guttag [31].

The restrictions component is a set of restrictions; every restriction is associated 
with an operation. There are two kinds of restrictions: 

(i) Preconditions, and 

(ii) Exception Conditions. 

Every exception listed in the syntactic specification of an operation should have an
associated restriction specifying the input condition when the exception is signalled or may
be signalled by the operation. The boolean conditions in the exception conditions for an
operation must be disjoint. Another constraint on the boolean conditions when they use
nondeterministic operations is discussed later. As was stated in the first chapter, for
operations having complex behavior, it may be very difficult to specify conditions on their
inputs under which they signal a particular exception. This approach of specifying the
exceptional behavior is not suitable for such operations.

3.1.3.1 Preconditions 

The precondition restriction for an operation specifies the subset of its input
domain on which the operation behavior is of interest. The operation is expected to be
invoked on inputs in this subset; it is the user's responsibility to ensure this. The operation
behavior is specified only on these inputs; it is left unspecified on inputs outside the subset
because it does not matter. An operation can either signal an exception or return a value
on an input not satisfying the precondition. For example, in certain applications, we may
not care how the operation Replace in Figure 3.2 behaves on the empty stack as it is never
going to be invoked on the empty stack. It could either return a stack value or signal an
exception. Also see [51, 32] for more examples of such operations. If a specification
commits to a particular behavior on an input not satisfying the precondition, for instance
signalling an exception, many implementations would be ruled out. Our approach is to
encourage a designer to specify only that portion of the data type behavior which is of
interest to him and allow the rest of the type behavior to be left unspecified so that an
implementor has the maximum flexibility.

The precondition restriction for an operation σ ∈ Ω is specified as:

    Pre(σ(X)) :: P(X),

where P(X) is a boolean term having x_1, ..., x_n (the input X) as its variables, and it cannot
signal on X. The axioms involving σ hold only if the input to every invocation of σ satisfies
the precondition P(X). If the Restrictions component does not specify a precondition for
an operation, the operation is assumed to be specified for its entire syntactic domain, i.e., its
precondition is T. For example, ~Empty(s) is the precondition for Pop as well as Replace
in the specification of Stk-Int in Figure 3.2. The specification does not specify the behavior
of these operations for the empty stack. No precondition is specified for any other
operation, so their preconditions are T. Similarly, no precondition is specified for any
operation in the specification of Set-Int in Figure 3.1. If a precondition different from T is
specified for an operation σ, σ is said to have a nontrivial precondition. Let P_σ stand for
the precondition for σ.

If an operation σ does not signal on an input not satisfying its precondition, it
cannot return an arbitrary value. If σ is a constructor, as for example, the operations Pop
and Replace in Figure 3.2, the result must be constructible by the constructors of D using
inputs satisfying the associated preconditions. Similarly, if σ is an observer, then it must
return a value of its result type.
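
The freedom left by a precondition can be illustrated with two Python implementations of Replace that differ only on the empty stack. The sketch is hypothetical: stacks are tuples with the top at the right, and the names StackError, replace_signalling, and replace_lenient are introduced here for illustration only.

```python
# Hypothetical sketch: stacks are tuples with the top at the right.
class StackError(Exception):
    """An exception some implementation may choose to signal."""


def push(s, i):
    return s + (i,)


def pop(s):
    return s[:-1]


def replace_signalling(s, i):
    # Implementation A: signals outside the precondition ~Empty(s).
    if not s:
        raise StackError("replace on empty stack")
    return s[:-1] + (i,)


def replace_lenient(s, i):
    # Implementation B: returns a constructible stack value instead.
    return s[:-1] + (i,)


def satisfies_replace_axiom(replace, s, i):
    """Check Replace(s, i) = Push(Pop(s), i), but only where Pre holds."""
    if not s:                 # precondition ~Empty(s) fails: nothing is required
        return True
    return replace(s, i) == push(pop(s), i)
```

Both implementations satisfy the axiom wherever the precondition holds, so both are acceptable; on the empty stack the lenient version still returns a value that the constructors can build.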






3.1.3.2 Exception Conditions 

There are two kinds of exception conditions: 
(i) Required exception conditions, and 
(ii) optional exception conditions. 

A required exception condition for an operation σ is expressed as

    R(X) ⇒ σ(X) signals ex(e_1, ..., e_k),

stating that if the input X satisfies the precondition P_σ and the boolean condition R(X),
which is a boolean term, then the operation σ must signal the exception ex having e_1, ..., e_k
as the arguments to its handler(s). The exception name ex is of arity D_1 × ... × D_k, and
each e_i is a term of type D_i having variables only from the set {x_1, ..., x_n}. For example,
in Figure 3.1, the operation Choose is specified to signal the exception no-element on the
empty set. In Figure 3.2, the operation Top signals no-top on the empty stack. We call the
above exception condition required because the operation is required to signal the
exception. It is possible to specify an operation signalling different exceptions for different
subsets of inputs.
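
A required exception condition can be mimicked in Python by raising a named exception exactly when the condition holds. This is a minimal sketch under assumptions: the class Signal and the helper signalled are hypothetical, and stacks are tuples with the top at the right.

```python
# Hypothetical sketch: exceptions carry a name and argument values.
class Signal(Exception):
    def __init__(self, name, *arguments):
        super().__init__(name, *arguments)
        self.name = name
        self.arguments = arguments


def top(s):
    # Required exception condition: Empty(s) => Top(s) signals no-top()
    if len(s) == 0:
        raise Signal("no-top")
    return s[-1]


def signalled(op, *inputs):
    """Return the name of the exception op signals on inputs, or None."""
    try:
        op(*inputs)
    except Signal as e:
        return e.name
    return None
```

The helper makes the condition testable: on every input satisfying Empty(s) the name no-top must be reported, and on any other input Top must return normally.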

In certain applications, it may be restrictive to require that an operation signal an
exception when its input satisfies a condition. At the same time, it may not be desirable to
leave the operation behavior completely unspecified. Instead, we would like to place
constraints on the behavior. If an input to the operation satisfies the specified condition,
the operation is specified to have the option of either signalling the specified exception or
returning a normal value. In case the operation chooses not to signal, it must behave as
specified by the axioms. Optional exception conditions are introduced to capture such
behavior of an operation. An optional exception condition is expressed as

    σ(X) signals ex(e_1, ..., e_k) ⇒ O(X),

stating that in case σ signals an exception ex having e_1, ..., e_k as arguments and the input X
satisfies the precondition P_σ, then the input X must also satisfy the boolean condition
O(X), a boolean term.

Optional exceptions are especially useful for specifying a set of similar data types
having values whose capacity (size) has different upper bounds. It is possible to state a size
requirement on the values of the data type, but at the same time not be very restrictive
about the requirement. An implementor could decide on the exact bound based on
convenience insofar as the specified bound condition is met. Such behavior of a data type
is specified by stating that the constructors have the option to signal exceptions.

For example, in the data type Stk-Int-100 defined in the previous chapter, the
operation Push signals if its stack argument is of size 100. If the desired requirement is that
a stack value be able to store at least 100 integers, this behavior of Push is very restrictive.
It rules out an implementation supporting stack values of size > 100, even though the
implementation has the desired behavior except that Push does not signal exactly on stacks
of size 100, but rather on stacks of size 128. We specify the desired requirement in
Figure 3.2 by stating that Push optionally signals; whenever Push signals overflow, its stack
argument must be at least of size 100. In this way, a specification specifies the least upper
bound on the size of the values of a data type, and the responsibility of deciding the exact
upper bound is delegated to an implementor. Such a specification is flexible and not
restrictive.
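
The optional exception condition on Push can be tested against implementations with different capacities. The following Python sketch is illustrative only: make_push, Overflow, and meets_optional_condition are hypothetical names, the capacities are arbitrary, and the check covers stack sizes only up to a chosen bound.

```python
# Hypothetical sketch: each implementation fixes its own capacity.
class Overflow(Exception):
    """Stands for the exception overflow(s, i)."""


def make_push(capacity):
    def push(s, i):
        if len(s) >= capacity:
            raise Overflow()
        return s + (i,)
    return push


def meets_optional_condition(push, max_size=200):
    """Check: Push(s, i) signals overflow => #(s) >= 100, for sizes up to max_size."""
    for n in range(max_size + 1):
        s = tuple(range(n))
        try:
            push(s, 0)
        except Overflow:
            if n < 100:
                return False      # signalled on a stack that was too small
    return True
```

Implementations with capacity 100 or 128 both pass, whereas one with capacity 50 signals too early and is rejected; the specification thus fixes a lower limit on capacity and leaves the exact bound to the implementor.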

3.1.3.3 Discussion 

Note that the nontrivial precondition restrictions and the optional exception
conditions leave the specification of the operations incomplete because the operation
behavior is not completely specified on a subset of inputs. An operation could behave on
such inputs in any way consistent with the specified behavior. That is why a specification
in general specifies a set of related data types; the operations of these data types have the
same behavior for a subset of their syntactic domains. For example, Stk-Int specifies data
types having stack values whose size has different upper bounds ≥ 100. The operations of
these data types behave the same way on stacks of size < 100, except that Pop and Replace
of different data types may behave differently on the empty stack. We call such
incompleteness in a specification intentional incompleteness, in contrast to unintentional
incompleteness introduced because of the omission on the part of a designer in specifying
the properties of the operations.

It should be intuitively clear that if no nontrivial precondition and no optional
exception condition are associated with any operation, and the axioms completely capture
the observable behavior of the operations, then a specification specifies a single data type in
case the specification of every defining type also specifies a single data type. We elaborate
this informal statement later in the chapter.

3.1.4 Axioms 

This component specifies the normal behavior of the operations in Ω and the
auxiliary functions in A_f, if they are used in a specification. The behavior is specified as a
finite set of equations of the form 'e_1 ≡ e_2', where e_1 and e_2 are auxiliary terms of the same
type; at least one of e_1 and e_2 must have its outermost symbol in Ω ∪ A_f, otherwise an
equation would not be specifying a property of D. 'e_1 ≡ e_2' informally means that the
sequences of operations expressed by the terms e_1 and e_2 have the same behavior, i.e., when
values are substituted for variables in e_1 and e_2, the instantiated terms interpret to
observably equivalent values. The symbol '≡' is interpreted as the observable equivalence
relation. The equations attempt to capture the observable equivalence relations on ground
terms defined by the data type(s) being specified, which is discussed in Chapter 2.

If a specification does not have the restrictions component (i.e., the operations do 
not signal exceptions and there is no nontrivial precondition associated with any operation), 
then the variables in an axiom are universally quantified: Any value of the appropriate type 
can be freely substituted for a variable. 

If a specification has a restrictions component, then an axiom is interpreted in a 
different way; the variables in an axiom cannot be freely substituted. We must also 
consider the restrictions imposed on the operations appearing in the axioms. The values 
substituted for the variables must satisfy the following two conditions: 

(i) For every operation σ having a nontrivial precondition P_σ, the arguments to every
invocation of σ in the axiom must satisfy P_σ, and

(ii) an instantiation of any subexpression in the axiom must not interpret to an exception 
value. 

The condition (ii) above is equivalent to requiring that an interpretation of an instantiation
of e_1 or e_2 is neither undefined nor an exception value. For example, consider the axiom

    Replace(s, i) ≡ Push(Pop(s), i)    (*)

in the specification of Stk-Int in Figure 3.2. It applies only for the values of s for which
~Empty(s) holds, which is the precondition for both Replace and Pop. Furthermore, Push
must not signal overflow on the result returned by Pop, which it cannot in any case. The
equations characterize the normal behavior of the operations in this way.
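
The restricted reading of an axiom can be sketched in Python for the axiom Pop(Push(s, i)) ≡ s over a bounded-stack model. The model is an assumption made for illustration: stacks are tuples, a single string ERR stands for every exception value, and the bound of 100 is that of Stk-Int-100.

```python
# Hypothetical sketch over a bounded-stack model with a single exception value.
ERR = "err"
LIMIT = 100


def push(s, i):
    if s == ERR:
        return ERR
    return s + (i,) if len(s) < LIMIT else ERR


def pop(s):
    if s == ERR or not s:
        return ERR
    return s[:-1]


def axiom_pop_push(s, i):
    """Interpret Pop(Push(s, i)) = s under conditions (i) and (ii).

    Returns None when the instantiation is excluded, else whether it holds.
    """
    pushed = push(s, i)
    if pushed == ERR:          # (ii): a subexpression interprets to an exception
        return None
    # (i): the precondition ~Empty of Pop holds, since pushed is nonempty here
    return pop(pushed) == s
```

On a full stack the subterm Push(s, i) interprets to an exception value, so the axiom simply does not apply there; read with free substitution it would wrongly equate an exception value with s.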

It is often the case that two terms are observably equivalent only when a condition 
is placed on their variables; for example, in the second axiom in the specification of Set-Int 
in Figure 3.1, Removc(Insert(s, il), i2) is observably equivalent tp I«sert(Remove(s, i2), il) 
only if il and i2 are not equal. So, while writing the axioms, it is convenient to assume an 
auxiliary function if-then-else corresponding to every D[ € A' U A t . The definition of 
if-then-else is given as: 

ir-then-else : Bool X D' X D' - D' as if ^ then x 2 else * 3 

if T then x eke y .* x 
if F then x else y = y. 

Since these functions are used frequently, they are assumed to be implicitly defined
whenever needed. They are not explicitly stated in the auxiliary functions component of
the specification, and are not in A_f. If Bool is not a defining type, then Bool is assumed to
be an auxiliary type. An axiom of the form 'e_1 ≡ if b then e_2' stands for the equation
'e_1 ≡ if-then-else(b, e_2, e_1)'. We call 'e_1 ≡ if b then e_2' a conditional equation. 5 It is
equivalent in its interpretation to the formula 'b ≡ T ⇒ e_1 ≡ e_2'. An axiom of the form
'e_1 ≡ if b then e_11 else e_12' stands for the equation 'e_1 ≡ if-then-else(b, e_11, e_12)'. It is
equivalent to the following two conditional equations:

'e_1 ≡ if b then e_11'

'e_1 ≡ if ~b then e_12'
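The reading of a conditional equation as an implication can be illustrated with a small
executable sketch. The Python below is not part of the specification language; the frozenset
model of integer sets and the names if_then_else, holds_conditionally, insert and remove are
assumptions made only for illustration. It checks the Remove/Insert axiom mentioned above
both with and without its condition.

```python
from itertools import product

def if_then_else(b, x, y):
    """Implicit auxiliary function: 'if T then x else y' is x, 'if F then x else y' is y."""
    return x if b else y

def holds_conditionally(e1, b, e2, instances):
    """'e1 = if b then e2' read as 'e1 = if-then-else(b, e2, e1)'.

    Instances on which b is F compare e1 with itself and impose no constraint.
    """
    return all(e1(*v) == if_then_else(b(*v), e2(*v), e1(*v)) for v in instances)

# Hypothetical model of integer sets (an assumption): Python frozensets.
def insert(s, i):
    return s | {i}

def remove(s, i):
    return s - {i}

sets = [frozenset(), frozenset({1}), frozenset({1, 2})]
ints = [0, 1, 2]
instances = list(product(sets, ints, ints))

lhs = lambda s, i1, i2: remove(insert(s, i1), i2)
rhs = lambda s, i1, i2: insert(remove(s, i2), i1)
distinct = lambda s, i1, i2: i1 != i2

conditional_ok = holds_conditionally(lhs, distinct, rhs, instances)
unconditional_ok = all(lhs(*v) == rhs(*v) for v in instances)
```

On these sample instances the conditional form holds while the unconditional equation
fails (take i1 = i2), which is why the condition is needed in the axiom.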



5. Note that a conditional equation as defined above is different from a positive conditional equation of the
ADJ [71], in which the condition in the axiom can be expressed using ≡ positively. A conditional equation of
the above form is called a restricted conditional equation in [43]. We have chosen such axioms because of
simplicity, as even using positive conditional equations as axioms does not add to the expressive power of the
specification language [43]. Furthermore, homomorphisms do not preserve positive conditional equations.






3.1.5 Specifying Nondeterministic Operations

If an operation is nondeterministic, this is specified using the symbol
nondeterministic following its range specification, as for the Choose operation of Set-Int in
Figure 3.1. The behavior of a nondeterministic operation is specified in the same way as of
a deterministic operation. The restrictions component may specify a precondition, a set of
required exception conditions, and a set of optional exception conditions for a
nondeterministic operation. For a nondeterministic observer returning many possible
results on an input, the axioms do not specify the results; instead, they specify the
properties of the results. For example, the axiom specifying the behavior of the
nondeterministic operation Choose of Set-Int on a nonempty set s states that a result
returned by Choose on s must be an element of the set s. For a nondeterministic
constructor, its behavior is characterized by specifying the results returned by the observers
on the possible values constructed by it.

If a boolean condition in a restriction is expressed using nondeterministic
operations, we require that for every input X, the boolean condition behaves
deterministically, i.e., it returns either T or F. It is meaningless for a boolean condition to
return T as well as F on X. In case of a precondition, the instantiated boolean condition
returning T as well as F would mean that the input satisfies the precondition as well as does
not satisfy the precondition. In case of an exception condition, this would mean that σ
signals or may signal on the input as well as that σ does not signal on the input.

For an equational axiom 'e_1 ≡ e_2' expressed using nondeterministic operations, we
use the following interpretation: For an instantiation of the variables in the axiom allowed
by the preconditions and restrictions, the set of possible values returned by the instantiated
e_1 is observably equivalent to the set of possible values returned by the instantiated e_2 (i.e.,
for every choice of nondeterministic operations in e_1, the value returned by the instantiated
e_1 is observably equivalent to a value returned by the instantiated e_2 for some choice of
nondeterministic operations in e_2, and vice versa). We have rejected another possible
interpretation which is that for any choice of nondeterministic operations in both e_1 and e_2,
the values returned by the instantiated e_1 and e_2 are observably equivalent, because under
this interpretation, the axiom does not hold when e_1 and e_2 exhibit nondeterministic
behavior; an equational axiom thus does not express any useful property. If an axiom is a
conditional equation 'e_1 ≡ if b then e_2' where the boolean condition b involves
nondeterministic operations, then we require that for an instantiation of the variables x_1, ...,
x_n, b behaves deterministically. As in case of a boolean condition in a restriction, an
instantiation of b behaving nondeterministically and returning T as well as F does not make
any sense in a conditional equation.
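The difference between the adopted and the rejected interpretation can be made concrete
with a short sketch. It is written in Python purely for illustration; modelling a
nondeterministic operation by the set of its possible results, and the names set_equivalent,
pointwise_equivalent and choose, are assumptions of this sketch rather than definitions from
the text.

```python
def set_equivalent(results1, results2, equiv):
    """Adopted reading: every result of e1 matches some result of e2, and vice versa."""
    forward = all(any(equiv(a, b) for b in results2) for a in results1)
    backward = all(any(equiv(a, b) for a in results1) for b in results2)
    return forward and backward

def pointwise_equivalent(results1, results2, equiv):
    """Rejected reading: every choice made in e1 matches every choice made in e2."""
    return all(equiv(a, b) for a in results1 for b in results2)

def choose(s):
    """Hypothetical nondeterministic Choose, modelled by its set of possible results."""
    return set(s)

same = lambda a, b: a == b   # stand-in for observable equivalence on integers
s = frozenset({3, 5})

adopted = set_equivalent(choose(s), choose(s), same)
rejected = pointwise_equivalent(choose(s), choose(s), same)
```

Under the rejected reading even the trivial equation Choose(s) ≡ Choose(s) fails on a
two-element set, since one side may pick 3 while the other picks 5; under the adopted
reading it holds.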

An alternate approach for specifying a nondeterministic operation would be to
indirectly specify it by having the axioms specify its relation, which is deterministic. The
relation can be specified using equations and conditional equations. However, the
constraint that if the nondeterministic operation returns a normal value on an input, then
the relation holds for the input and at least one result, cannot be expressed in terms of
equations and conditional equations. This can be circumvented by assuming that every
such relation satisfies the above constraint. If a nondeterministic operation signals on an
input, some convention about the behavior of the relation on such an input must be
decided. Using this approach, it is possible to specify the precise amount of
nondeterminism an operation should have. However, we have adopted the former
approach because of the following reasons:

(i) We do not want the specification to specify the precise amount of nondeterminism an
operation should have; instead, we leave this decision to the designer of an
implementation,

(ii) it seems more natural to directly specify the behavior of an operation than specifying
the corresponding relation,

(iii) the semantics of a specification designed using the latter approach would have to be
derived indirectly, as should be evident from the discussion in the next section, and

(iv) if we adopt the latter approach, the normal behavior of the nondeterministic
operation would be indirectly specified by specifying its relation, whereas its exceptional
behavior would be directly specified. We would like to avoid using two notations for the
same concept.

But one major advantage of adopting the latter approach is that we do not have to develop
any additional formalism for nondeterministic operations. The theory developed for
specifications specifying only deterministic operations applies to nondeterministic
operations also.

3.1.6 Specification of Mutually Recursive Data Types

A specification for mutually recursive data types is similar to a specification for
nonrecursive data types. Let 𝒟 stand for an instance of a group of mutually recursive data
types being specified. The specification is given either the name of some data type in 𝒟 or
a name different from the names of data types in 𝒟. Like a specification of a nonrecursive
data type, it has four components:

(i) Operations,

(ii) Auxiliary Functions,

(iii) Restrictions, and

(iv) Axioms.

The Operations component specifies the syntactic properties of the operations of 𝒟. It is
divided into subcomponents. There is a subcomponent entitled D corresponding to every
data type D in 𝒟 specifying the operations of D. So, a subcomponent is like the operations
component of a nonrecursive data type as discussed above. Besides, there is another
subcomponent entitled Combined Operations, which specifies the syntactic properties of the
operations not belonging to any particular data type, but rather to the whole group 𝒟. The
remaining three components are the same as in a specification of a single data type. If 𝒟
does not have any combined operations, the specifications of data types in 𝒟 can be given
separately like nonrecursive data types. However, the semantics of these specifications
must be given together.

Henceforth, we discuss only nonrecursive data types. From the following
discussion, it should be clear how to extend the results and the theory to mutually recursive
data types. For instance, we can give the semantics of such a specification in a similar way
as for nonrecursive data types (discussed in the next section) except that we will need to use
type algebras defined in Section 2.4.






3.2 Semantics of Specification Language 

The semantics of a specification S is defined to be a set of related data types.
Each data type in the set is said to be specified by S. Let D(S) stand for this set. Since a
specification S refers to other specifications assuming them to be given, for example, the
specification of Set-Int refers to the specifications of Int and Bool, the semantics of S is
given using their semantics. For a defining type D' ∈ Δ used in S, we assume that D' has a
specification S' having a nonempty set of data types as its semantics; D' stands for any data
type in D(S').

If S does not specify any nondeterministic operation, then every data type in D(S)
can be shown to be deterministic. Operations of different data types in D(S) share the
common behavior specified by S. Different data types differ in the way their operations
behave on inputs not satisfying the preconditions specified for the operations and/or on
inputs on which the operations are specified to have the option between signalling and
returning a value. If the axioms do not completely capture the observable behavior of the
operations, then data types in D(S) have operations having different behavior on inputs on
which the axioms leave their behavior unspecified.

In case S specifies nondeterministic operations, then data types in D(S) also differ
in the amount of nondeterminism their operations have. D(S) has data types in which the
operations specified to be nondeterministic are deterministic as well as data types in which
such operations have the maximum amount of nondeterminism allowed by S. For
example, the semantics of the specification of Set-Int given in Figure 3.1 has a data type in
which the operation Choose is deterministic, returning the maximum integer in a nonempty
set s passed as the argument to Choose. It also has the data type Set-Int defined in the
previous chapter in which Choose nondeterministically picks any element of s. In
general, a data type in D(Set-Int) has the operation Choose return an element from a
nonempty subset of s.
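As an illustration of how several data types can share one specification, the following
Python sketch models three versions of Choose by the set of results each may return. The
function names and the particular intermediate version are assumptions introduced here;
only the membership property being checked comes from the text.

```python
def choose_max(s):
    """Deterministic version: Choose always returns the largest element."""
    return {max(s)}

def choose_any(s):
    """Most nondeterministic version: Choose may return any element of s."""
    return set(s)

def choose_two_smallest(s):
    """An intermediate version (hypothetical): pick among the two smallest elements."""
    return set(sorted(s)[:2])

def satisfies_choose_axiom(choose, sample_sets):
    """Every possible result of Choose(s) must be an element of the nonempty set s."""
    return all(len(choose(s)) > 0 and choose(s) <= s for s in sample_sets if s)

samples = [frozenset({7}), frozenset({1, 4, 9}), frozenset({2, 3})]
versions = [choose_max, choose_any, choose_two_smallest]
all_are_models = all(satisfies_choose_axiom(v, samples) for v in versions)
```

All three versions pass the check on the sample sets, while a version returning a value
outside s would not.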

The semantics of a specification specifying nondeterministic operations is thus
necessarily a set of data types differing in the amount of nondeterminism their operations
have, even if the specification does not specify any precondition or any optional exception
condition for the operations and the specification completely specifies the observable
behavior of the operations. This semantics of a specification is chosen because of our view
that a specification should not constrain an implementation to have any precise amount of
nondeterminism, and that the decision about how much nondeterminism an
implementation should have be left to the designer of the implementation. Since a
specification serves as an interface between the programs using the data type and the
implementation(s), every theorem derived from the specification, as discussed in the next
chapter, must hold for a correct implementation when interpreted appropriately.

It is possible to write a specification in our language which specifies unbounded
nondeterminism. (The term unbounded nondeterminism used here is different from the
way it is used in [13, 35].) For example, the specification of N_1 (a version of the data
type natural number) in Figure 3.3 specifies unbounded nondeterminism because the
operation Pick is specified to have unbounded nondeterminism. For such a specification
there does not exist any data type having maximal amount of nondeterminism. We will
precisely state the condition when a specification S specifies unbounded nondeterminism.
For a specification specifying bounded nondeterminism, we define data types having
maximal amount of nondeterminism allowed by the specification.

Instead of giving the semantics of S directly in terms of data types, we give its
semantics as a set of (well formed) type algebras. Let F(S) stand for this set. We then
partition this set using the behavioral equivalence relation on type algebras and get the set
D(S) of data types. Each type algebra in F(S) is a model of some data type specified by S.
We first assume that S does not use any auxiliary functions, i.e., A_f = ∅ and A_t = ∅.
Later, we discuss the semantics of S assuming that A_f ≠ ∅ and A_t ≠ ∅.

3.2.1 Specifications without Auxiliary Functions 

A type algebra in F(S) must have the syntactic structure as specified in the 
operations component of S and the observable behavior as specified by the axioms and the 
restrictions in S. F(S) is inductively defined; as in Chapter 2, we combine the basis and 
inductive steps into a single step. F(S) consists of all (well formed) type algebras of the 
form 

A = [ {V_{D'} | D' ∈ Δ'}, EXV; {f_σ | σ ∈ Ω} ]



Figure 3.3. Specification of N_1

Operations
    0    : → N_1
    S    : N_1 → N_1
    P    : N_1 → N_1
               → no-pred()
    =    : N_1 × N_1 → Bool    as x_1 = x_2
    ≥    : N_1 × N_1 → Bool    as x_1 ≥ x_2
    Pick : → N_1               nondeterministic

Restrictions
    x = 0 ⇒ P(x) signals no-pred()

Axioms
    P(S(x)) ≡ x
    x ≥ x ≡ T
    x ≥ z ≡ if (x ≥ y ∧ y ≥ z) then T
    S(x) ≥ x ≡ T
    x ≥ S(x) ≡ F
    x ≥ S(y) ≡ if ~(x ≥ y) then F
    x = y ≡ (x ≥ y ∧ y ≥ x)
    Pick() ≥ 0 ≡ T





such that A satisfies the restrictions and the axioms in S, where for each D' ∈ Δ, V_{D'} is the
principal domain of an algebra A' ∈ F(S'); A' is a model of a data type D' in D(S').

We first discuss when a type algebra A satisfies restrictions; later we discuss the
axioms. Let X = { x_1, ..., x_n } stand for all variables in an axiom or a restriction. Let
V = { v_1, ..., v_n }, where each v_i is a normal value of the appropriate type, stand for an
A-instance of X, i.e., each v_i is an instance of x_i.

3.2.1.1 Restrictions 

If a nontrivial precondition P_σ is specified for a constructor σ, then on an input V
such that P_σ[X/V] interprets to F, f_σ(v_1, ..., v_n) either signals or returns a value
constructible by the constructor functions using arguments satisfying their preconditions.
It would be meaningless to allow f_σ to return an arbitrary value that cannot even be
constructed. For example, if a data type satisfying the specification in Figure 3.2 has its
Push operation signal overflow on stacks of size 128, it is absurd to let the operation Pop
return a stack of size 1000 when applied on the empty stack, the input that does not satisfy
the precondition specified for Pop. Similarly, if σ is an observer, then f_σ(v_1, ..., v_n) either
signals or returns a value in V_{D'}, where D' is the result type of σ.

If the restrictions component specifies a required exception condition on σ as
R(X) ⇒ σ(X) signals ex(e_1, ..., e_k),
then for every V, if both P_σ[X/V] and R[X/V] interpret to T, then f_σ(V) must signal the
exception value ex(e_1[X/V]|_A, ..., e_k[X/V]|_A) for A to satisfy the above restriction.

If the restrictions component specifies σ to optionally signal an exception, i.e.,
σ(X) signals ex(e_1, ..., e_k) ⇒ O(X),
then for every V such that P_σ[X/V] interprets to T and f_σ(V) signals the exception ex with
the interpretations of e_1[X/V], ..., e_k[X/V] as arguments to its handlers, O[X/V] must
interpret to T for A to satisfy the above restriction.

Since the restrictions are assumed to completely specify the exceptional behavior
of the operations, for every operation σ, the interpretation f_σ in A must be such that
f_σ(v_1, ..., v_n) is a normal value if (i) P_σ[X/V] holds, (ii) none of R[X/V] holds, and (iii)
none of O[X/V] holds.
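The two kinds of exception restrictions can be checked mechanically on sample inputs. The
sketch below is in Python and is only illustrative: the Signal class, the helper names, and
the bounded push with its limits (optionally signalling from size 100, actually signalling
at size 128) are hypothetical choices, while the required no-pred() condition is the one
from Figure 3.3.

```python
class Signal(Exception):
    """An exception value signalled by an operation (hypothetical encoding)."""
    def __init__(self, name):
        super().__init__(name)
        self.name = name

def outcome(f, *args):
    """Return ('normal', value) or ('signal', name) for one invocation of f."""
    try:
        return ("normal", f(*args))
    except Signal as sig:
        return ("signal", sig.name)

def satisfies_required(f, cond, name, inputs):
    """R(X) => f(X) signals name: wherever R holds, f must signal name."""
    return all(outcome(f, *v) == ("signal", name) for v in inputs if cond(*v))

def satisfies_optional(f, cond, name, inputs):
    """f(X) signals name => O(X): f may signal name only where O holds."""
    return all(cond(*v) for v in inputs if outcome(f, *v) == ("signal", name))

def pred(x):
    """Predecessor on naturals with the required exception of Figure 3.3."""
    if x == 0:
        raise Signal("no-pred")
    return x - 1

def push(stack, i):
    """Hypothetical bounded Push that chooses to overflow at size 128."""
    if len(stack) >= 128:
        raise Signal("overflow")
    return stack + (i,)

nat_inputs = [(n,) for n in range(5)]
stack_inputs = [(tuple(range(n)), 0) for n in (0, 99, 100, 128, 200)]

required_ok = satisfies_required(pred, lambda x: x == 0, "no-pred", nat_inputs)
optional_ok = satisfies_optional(push, lambda s, i: len(s) >= 100, "overflow", stack_inputs)
```

A push that overflowed already at size 50 would violate the optional condition assumed
here, because it would signal on an input where the condition does not hold.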

3.2.1.2 Axioms 

A (behaviorally) satisfies an equation 'e_1 ≡ e_2' (or 'e_1 ≡ e_2' holds in A) if and only if
for every V, one of the following conditions holds:

(i) The instantiation of e_1 or of e_2 interprets to an exception or is undefined,

(ii) the input to an invocation of some σ on v'_1, ..., v'_m does not satisfy the
precondition associated with σ (i.e., P_σ(v'_1, ..., v'_m) interprets to F) when
the instantiations of e_1 and e_2 are interpreted, or

(iii) { e_1[X/V]|_A } is observably equivalent to { e_2[X/V]|_A }.

In the previous section, we informally described the semantics of conditional
equations using the auxiliary functions if-then-else. Here we formalize the discussion. To
check whether a conditional equation 'e_1 ≡ if b then e_2' holds in A, we extend A to include
the interpretation of the auxiliary function if-then-else : Bool × D' × D' → D'
corresponding to every D' ∈ Δ'. The interpretation f_if-then-else in the extended algebra has
the following behavior:

f_if-then-else(T, v_1, v_2) = v_1

f_if-then-else(F, v_1, v_2) = v_2

The interpretation of a conditional equation involving if-then-else can be verified to be
equivalent to interpreting the formula 'b ≡ T ⇒ (e_1 ≡ e_2)', as we require that b behave
deterministically for every A-instance. Henceforth, we view a conditional equation as a
formula 'b ⇒ e_1 ≡ e_2' so that we do not have to consider the auxiliary functions if-then-else.

If a type algebra A is in F(S), then we say that A behaviorally satisfies S, and call
A a model of the specification S. Note that A satisfies the axioms under the interpretation
of the symbol '≡' as the observable equivalence relation on the domains of a type algebra.
If a model A of S satisfies the axioms interpreting '≡' as the identity relation as in Logic, we
say that A identically satisfies S.

For example, the models A_s1 and A_s2 of the data type Set-Int discussed in
Chapter 2 can be shown to be in F(Set-Int). So, they are also the models of the
specification of Set-Int given in Figure 3.1. A_s1 identically satisfies the specification of
Set-Int. It should be easy to see that every reduced algebra in F(S) identically satisfies a
specification S because the observable equivalence relations are the identity relations.

Using the fact that the set E of observable equivalence relations on the domains in
A above is a congruence, we have

Thm. 3.1 A ∈ F(S) iff A/E ∈ F(S). ■

So, to check whether a type algebra A is in F(S), we can check whether its reduced algebra
A/E identically satisfies S. Using the above theorem, we get

Thm. 3.2 If A ∈ F(S), then every type algebra behaviorally equivalent to A is in F(S). ■
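The passage from an algebra to its reduced algebra can be pictured with a small Python
sketch. Everything named below is an assumption for illustration: sets are represented by
tuples that record every insertion, observable equivalence is approximated by agreement of
a membership observer on a finite range of probes, and the reduced value is the frozenset
of elements.

```python
def insert_seq(s, i):
    """Unreduced model: a set value is a tuple recording every insertion."""
    return s + (i,)

def member_seq(s, i):
    """The observer used to tell values apart."""
    return i in s

def reduce_value(s):
    """Map a value to its class under observable equivalence (its element set)."""
    return frozenset(s)

def insert_red(s, i):
    """Insert in the reduced model, acting on equivalence classes."""
    return s | {i}

def observably_equal(s1, s2, probes):
    """Approximate observable equivalence by agreement of the observer on probes."""
    return all(member_seq(s1, i) == member_seq(s2, i) for i in probes)

probes = range(5)
a = insert_seq(insert_seq((), 1), 2)   # Insert(Insert(empty, 1), 2)
b = insert_seq(insert_seq((), 2), 1)   # Insert(Insert(empty, 2), 1)

behaviorally = observably_equal(a, b, probes)                    # equivalent in A
identically_in_A = a == b                                        # not identical in A
identically_in_quotient = reduce_value(a) == reduce_value(b)     # identical in A/E
commutes = reduce_value(insert_seq(a, 3)) == insert_red(reduce_value(a), 3)
```

The two terms are only observably equivalent in the unreduced model but become identical
after reduction, and reduction commutes with Insert, which is the congruence property the
theorem relies on.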






3.2.2 Specifications with Auxiliary Functions 

An auxiliary function is not a part of a data type, so a model in F(S) cannot have
any interpretation for the auxiliary functions. We first define an extended data type D_1
from D, whose operation set is Ω ∪ A_f and the set of defining types is Δ ∪ A_t. If the
Auxiliary Functions component is included in the Operations component in S, the modified
specification S_1 is a specification of data types having the same syntactic structure as D_1,
and S_1 does not use any auxiliary functions. We define F(S_1) for the modified
specification S_1 as discussed above. An algebra A_1 of type D_1 in F(S_1) is

A_1 = [ {V^1_{D'} | D' ∈ Δ' ∪ A_t}, EXV; {f^1_σ | σ ∈ Ω ∪ A_f} ].

So an auxiliary term can be interpreted in A_1. The axioms in S expressed using the
auxiliary functions in A_f hold in A_1.

For every algebra A_1 of type D_1 in F(S_1), we obtain an algebra A of type D in
F(S) as follows:

A = [ {V_{D'} | D' ∈ Δ'}, EXV; {f_σ | σ ∈ Ω} ]

where for each D' ∈ Δ, V_{D'} = V^1_{D'}, and V_D ⊆ V^1_D. A function f_σ is the restriction of
f^1_σ to the domains of A such that V_D is the smallest set closed under finitely many
applications of the functions in { f_σ | σ ∈ Ω_c }. V_D can be a proper subset of V^1_D because S
may use an auxiliary function having D as its range that constructs some extraneous values
(see [??] for an example of such a specification). 6

For example, the model of the data type Stk-Int-100 discussed in Chapter 2
can be shown to be in F(Stk-Int). We must extend this model to include the interpretation of
the auxiliary function Size such that Size(⟨i_1, ..., i_m⟩) = m, and use the extended algebra
for proving that it satisfies the axioms and restrictions in Figure 3.2.



6. However, we do not encourage specifications in which auxiliary functions are of result type D and
generate values not constructible by the constructors of D.






3.2.3 Semantics of a Specification 



Using Theorem 3.2, we partition F(S) using the behavioral equivalence relation
on type algebras, and get the set D(S) of data types as the semantics of S. A reduced
algebra in every equivalence class in the partition on F(S) can serve as a representative of
the data type defined by the equivalence class. This can be pictorially expressed as

        S

        D(S) = { D_1, ..., D_k, ... }

        F(S) = { A_11, ..., A_1i, ..., A_k1, ..., A_kj, ... }

where D_1, ..., D_k, ... are the data types in D(S), and A_k1, ..., A_kj, ... are the
models of a data type D_k.

It should be clear from the discussion in the last two subsections that the
operations of different data types in D(S) share the behavior specified by S. However, they
differ in
(i) the amount of nondeterminism they have, if specified to be nondeterministic by S,
(ii) their behavior on inputs not satisfying the preconditions specified by S,
(iii) their behavior on inputs satisfying the preconditions and optional exception
conditions specified by S, and
(iv) their behavior on inputs on which their behavior is unintentionally omitted in S.
If S specifies σ to optionally signal on a subset of inputs, σ for different data types may or
may not signal for some of the inputs in the subset. If the constructors are specified to
optionally signal for expressing the size requirement on the values of a data type, different
data types have different upper bounds on the size of their values.

For example, D(Set-Int) defines different data types in which Choose behaves
differently because it has different amounts of nondeterminism, as was discussed earlier.
D(Stk-Int) has different data types whose operations Pop and Replace have different
behavior on the empty stack, and the operation Push behaves differently on stacks of
size > 100. Some of the data types differ in the maximum size allowed of the stacks. The
data type Stk-Int-100 defined in Chapter 2 is in D(Stk-Int).






3.3 Specification of a Data Type and Equivalence of 
Specifications 

Def. 3.2 A specification S specifies a data type D iff D ∈ D(S) (i.e., M_D ⊆ F(S)). 7 ■

If a specification S specifies the data type D, the specification need not be precise in the
sense that it may not completely specify the behavior of D; a portion of the behavior may
not be, in fact, captured by S at all. There may be data types in D(S) different from D. We
introduce the following stronger definition for specifications specifying deterministic
operations only.

Def. 3.3.1 S precisely specifies D iff D(S) = { D } (i.e., M_D = F(S)). ■

The above definition requires that the specification of a defining type D' ∈ Δ also precisely
specifies D'.

For a specification specifying nondeterministic operations, its semantics has data
types differing in the amount of nondeterminism their operations have, nondeterminism
allowed by S. We define a partial ordering on type algebras in F(S) which orders data
types in D(S) based on the amount of nondeterminism in their operations that are specified
to be nondeterministic by S. Instead of comparing two arbitrary type algebras in F(S), it is
convenient to compare algebras having the same domains but differing in their functions.

Def. 3.4 Given two type algebras A and A' of D

A = [ {V_{D'} | D' ∈ Δ'}, EXV; {f_σ | σ ∈ Ω} ]

A' = [ {V_{D'} | D' ∈ Δ'}, EXV; {f'_σ | σ ∈ Ω} ],

A' is at least as nondeterministic as A, expressed as A ≤_nd A', if and only if
for every operation σ ∈ Ω, and for each v_1, ..., v_n,

{ f_σ(v_1, ..., v_n) } ⊆ { f'_σ(v_1, ..., v_n) }. ■

Informally, the above means that every function in A' is at least as much nondeterministic
as the corresponding function in A. We say that A <_nd A' if and only if A ≤_nd A' and there
is at least one nondeterministic function f'_σ in A' such that for some v_1, ..., v_n,

{ f_σ(v_1, ..., v_n) } ⊆ { f'_σ(v_1, ..., v_n) } and { f_σ(v_1, ..., v_n) } ≠ { f'_σ(v_1, ..., v_n) }.

We can order the reduced models in F(S) using the <_nd relation.

7. Recall that M_D is the set of all models of the data type D.
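The ordering on nondeterminism amounts to comparing result sets by inclusion. A minimal
Python sketch follows, assuming (as an illustration only) that each function is modelled by
a Python function returning the set of its possible results; the names at_least_as_nd and
strictly_more_nd are hypothetical.

```python
def at_least_as_nd(f, f_prime, inputs):
    """The 'at least as nondeterministic' test for one operation:
    f(v) must be a subset of f_prime(v) on every input."""
    return all(f(*v) <= f_prime(*v) for v in inputs)

def strictly_more_nd(f, f_prime, inputs):
    """The strict version: a subset everywhere and a proper subset on some input."""
    return at_least_as_nd(f, f_prime, inputs) and any(f(*v) < f_prime(*v) for v in inputs)

choose_max = lambda s: {max(s)}   # deterministic Choose
choose_any = lambda s: set(s)     # Choose returning any element

inputs = [(frozenset({1}),), (frozenset({1, 2, 3}),)]
weak = at_least_as_nd(choose_max, choose_any, inputs)
strict = strictly_more_nd(choose_max, choose_any, inputs)
reverse = at_least_as_nd(choose_any, choose_max, inputs)
```

On these inputs the version returning any element is strictly more nondeterministic than
the version returning the maximum, and the converse comparison fails.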

Def. 3.5 A reduced model A in F(S) has maximal amount of nondeterminism allowed by S
if and only if there is no reduced model A' in F(S) such that A <_nd A'. ■

If a reduced algebra A ∈ F(S) has maximal amount of nondeterminism allowed by S, then
it can be shown that any algebra behaviorally equivalent to A also has maximal amount of
nondeterminism allowed by S. Using this, we get

Def. 3.6 A data type D ∈ D(S) has maximal amount of nondeterminism allowed by S if its
reduced model has maximal amount of nondeterminism allowed by S. ■

For example, the model A_s1 has maximal amount of nondeterminism allowed by
the specification of Set-Int in Figure 3.1, so the data type Set-Int defined in Chapter 2 has
maximal amount of nondeterminism allowed by the specification in Figure 3.1. It is easy to
see that no model of the specification of N_1 in Figure 3.3 can have maximal amount of
nondeterminism; given any model A, we can find an A' such that A <_nd A'.

Def. 3.7 A specification S specifies unbounded nondeterminism if and only if D(S) is not 
empty and there does not exist a data type in D(S) with maximal amount of 
nondeterminism allowed by S. ■ 

So, the specification of N_1 specifies unbounded nondeterminism because of the operation
Pick. The specification of Set-Int specifies bounded nondeterminism as there are data
types with maximal amount of nondeterminism allowed by the specification of Set-Int in
D(Set-Int). In this thesis, we have considered data types with operations having only finite
nondeterminism, so we are interested in specifications that specify bounded
nondeterminism. Henceforth, we assume that a specification S does not specify
unbounded nondeterminism.
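Why Pick admits no maximal model can be seen from a short Python sketch. It assumes,
for illustration only, that a model of Pick is given by a finite nonempty set of natural
numbers it may return; the function names are hypothetical.

```python
def satisfies_pick_axiom(pick_results):
    """Pick() >= 0 is T: every possible result is a natural number."""
    return len(pick_results) > 0 and all(n >= 0 for n in pick_results)

def more_nondeterministic(pick_results):
    """From any finite result set, build a strictly larger one that is still allowed."""
    return pick_results | {max(pick_results) + 1}

chain = [{0}]
for _ in range(3):
    chain.append(more_nondeterministic(chain[-1]))
```

Each set in the chain satisfies the axiom and is a proper subset of the next, so the
construction never reaches a maximal element as long as result sets stay finite.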

In case of a specification specifying nondeterministic operations, we have

Def. 3.3.2 S precisely specifies D if { D } = { D' | D' ∈ D(S) and D' has maximal
amount of nondeterminism allowed by S }. ■

The above definition also covers the case 3.3.1 above, as in case of a specification specifying
only deterministic operations, the set { D' | D' ∈ D(S) and D' has maximal amount of
nondeterminism allowed by S } is the same as D(S). For
example, the specification in Figure 3.1 precisely specifies the data type Set-Int defined in
Chapter 2, whereas the specification in Figure 3.2 does not precisely specify the data type
Stk-Int-100 defined in Chapter 2.

We can also show that a specification S is correct w.r.t. a model A by showing
that A ∈ F(S).

We can define equivalence among specifications as follows: 

Def. 3.8 Two specifications S_1 and S_2 are equivalent, expressed as S_1 = S_2, iff
D(S_1) = D(S_2) (i.e., F(S_1) = F(S_2)). ■

Note that we do not make any distinction between a specification in which the
constructors are 'completely' specified and another specification in which some of the
properties of the constructors are not specified. For example, the specification of Set-Int
does not specify the property of Insert that the order in which integers are inserted does not
matter. The specification in Figure 3.1 is equivalent to the new specification obtained by
adding the following axiom because both have the same semantics:

Insert(Insert(s, i1), i2) ≡ if i1 = i2 then Insert(s, i1) else Insert(Insert(s, i2), i1).

However, as we discuss in Chapter 4, it is possible to prove more properties about Set-Int
using the specification with the above axiom than the specification given in Figure 3.1. We
distinguish between the two specifications there, and define a stronger equivalence relation
on specifications which incorporates this distinction.

We have discussed above one way of precisely specifying a data type D. As stated
in the beginning of this chapter, D can be presented in many ways. 8 One way is to present
a representative model A and define the semantics of such a presentation to be { A' | A' is
behaviorally equivalent to A }, as in [3]. There could be other ways of presenting data
types. If the semantics of these methods can be given in terms of type algebras using our
formalism, we can relate specifications given using different methods (see discussion in
Section 3.6).

8. We have deliberately used the word 'presented' instead of 'specified' to avoid confusion, as we have
precisely characterized above when a data type can be specified.






3.4 Specification of Bool 

In Chapter 2, we defined the data type Bool which serves as the basis of our
formalism. Figure 3.4 contains a specification of Bool; this specification cannot be
expressed in the proposed specification language because it has an inequality

T ≠ F

as an axiom. This axiom is introduced to capture the property that the boolean constants T
and F are distinguishable from each other. The semantics of the specification is the data
type Bool; it can be verified that every axiom in the specification holds in a model of Bool.
Because of the inequality, we do not need to introduce inequalities in the specifications of
other data types; we will show in the next chapter (Subsection 4.2.3) how to deduce them
using the above inequality. The specification of Bool is assumed to be given.



Figure 3.4. Specification of Bool

Operations
    T       : → Bool
    F       : → Bool
    not     : Bool → Bool           as ~x
    or      : Bool × Bool → Bool    as x_1 ∨ x_2
    and     : Bool × Bool → Bool    as x_1 ∧ x_2
    implies : Bool × Bool → Bool    as x_1 ⇒ x_2
    eqv     : Bool × Bool → Bool    as x_1 ⇔ x_2

Axioms
    T ≠ F
    ~T ≡ F
    ~F ≡ T
    x ∨ y ≡ y ∨ x
    x ∨ T ≡ T
    F ∨ F ≡ F
    x ∧ y ≡ ~((~x) ∨ (~y))
    (x ⇒ y) ≡ (~x) ∨ y
    x ⇔ y ≡ (x ⇒ y) ∧ (y ⇒ x)
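That the axioms of Figure 3.4 hold in a model of Bool can be checked by exhausting the
truth table. The Python sketch below takes Python's own booleans as the assumed model; the
function names and the textual labels of the axioms are illustrative choices, not notation
from the figure.

```python
from itertools import product

T, F = True, False

def not_(x):
    return not x

def or_(x, y):
    return x or y

def and_(x, y):
    return x and y

def implies(x, y):
    return y if x else True

def eqv(x, y):
    return x == y

# Each axiom is a predicate over one instance (x, y) of its variables.
AXIOMS = {
    "T != F": lambda x, y: T != F,
    "~T = F": lambda x, y: not_(T) == F,
    "~F = T": lambda x, y: not_(F) == T,
    "x or y = y or x": lambda x, y: or_(x, y) == or_(y, x),
    "x or T = T": lambda x, y: or_(x, T) == T,
    "F or F = F": lambda x, y: or_(F, F) == F,
    "x and y = ~((~x) or (~y))": lambda x, y: and_(x, y) == not_(or_(not_(x), not_(y))),
    "x implies y = (~x) or y": lambda x, y: implies(x, y) == or_(not_(x), y),
    "x eqv y = (x implies y) and (y implies x)":
        lambda x, y: eqv(x, y) == and_(implies(x, y), implies(y, x)),
}

# Collect the names of axioms that fail on some instance; none should.
failed = [name for name, holds in AXIOMS.items()
          if not all(holds(x, y) for x, y in product((T, F), repeat=2))]
```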








3.5 Properties of a Specification 

We discuss two properties of a specification, namely consistency and behavioral 
completeness, based on its semantics. These properties are different from the consistency 
and sufficient completeness properties defined by Guttag and Horning [28], which are 
proof theoretic (i.e., based on what can be deduced from a specification). We discuss the 
relationships between the properties introduced in this section and the properties defined 
by Guttag and Horning in the next chapter.

Consistency and behavioral completeness are both structural properties; they
ensure proper relationships among different components of a specification. Generally
speaking, consistency means that a property assumed already is not invalidated. In this
case, it means that properties expressed in the specification of a defining type or an
auxiliary type, or the assumptions made about the way the exceptional behavior of the
operations be specified, are not invalidated. It ensures that a specification specifies at least
one data type.

Behavioral completeness captures the intuition that a specification completely
specifies the observable behavior of the operations on the intended inputs (i.e., inputs
satisfying the associated preconditions). A designer of a specification intentionally leaves
the operation behavior unspecified by associating preconditions and optional exception
conditions with the operations. Apart from intentional incompleteness, a specification may
be incomplete because the designer unintentionally omitted some axioms. The behavioral
completeness property ensures that a specification is only intentionally incomplete. So, it
warns against any omission. It is a desirable property for most of the specifications.

We first discuss the consistency property; later, we discuss the behavioral
completeness property.

3.5.1 Consistency

A specification S is, informally speaking, inconsistent
(i) if S specifies ground terms of a defining type (or an auxiliary type) that are specified
to be distinguishable by its specification, to be observably equivalent, or

(ii) if S specifies ground terms of a defining type (or an auxiliary type) that are specified
to be observably equivalent by its specification, to be distinguishable.

An example of the first case would be a specification S using the specification of Bool and
specifying T and F to be observably equivalent. An example of the second case is the
specification of EX1 given in Figure 3.5. The data type EX1 has only one value. The
predicate P distinguishes among observably equivalent ground terms of Set-Int; P returns
T if and only if in its set argument, an integer has been inserted more than once; otherwise,
it returns F. This property of the set values is not observable by the operations of Set-Int as
specified in Figure 3.1.

In either case, S does not have any models, i.e., F(S) = ∅. In the first case, no 
type algebra can satisfy S because one of the axioms would want two distinguishable values 
in the domain of D' to be observably equivalent. In the second case, S does not have any 
models because of the well formedness property of a type algebra (which is that the set of 
observable equivalence relations is a congruence). 

EX1 cannot be implemented in any programming language in which an 
implementation of a data type is hierarchically structured and the representation of a data 
type is hidden from the users of the data type, since only the [...] can be observed. Thus 
the predicate P cannot be implemented, because the 
implementation of P must distinguish between, for example, the observably equivalent 
ground terms Insert(Insert(∅, 0), 0) and Insert(∅, 0). Polajnar [67] has also discussed such a 
violation by a specification S of the specifications of the defining types. He said such a 



Figure 3.5. Specification of EX1 

Operations 

a : → EX1 

P : EX1 × Set-Int → Bool 

Axioms 

P(a, ∅) ≡ F 

P(a, Insert(s, i)) ≡ if i ∈ s then T else P(a, s) 






specification had projection errors. 

A specification can also be inconsistent because the exceptional behavior of the 
operations is not properly specified; for example, the boolean conditions in exception 
condition restrictions may not be disjoint. 

Def. 3.9 A specification S is consistent if and only if (i) the specification S' of D', for each 
D' ∈ A ∪ A t , is consistent, and (ii) D(S) is not the empty set. ■ 

A specification S defines observable equivalence relations on ground terms just 
like a data type does. By a term here, we mean a term constructed without using auxiliary 
functions. 

Def. 3.10 S specifies two ground terms e_1 and e_2 of type D' ∈ A' to be observably equivalent 
(or e_1 and e_2 are observably equivalent by S) iff e_1 and e_2 are observably equivalent in every 
data type in D(S) (i.e., the possible interpretations of e_1 in a model A ∈ F(S) are observably 
equivalent to the possible interpretations of e_2 in A). ■ 

Def. 3.11 S specifies e_1 and e_2 to be distinguishable iff e_1 and e_2 are distinguishable in every 
data type in D(S) (i.e., the possible interpretations of e_1 in a model A in F(S) are 
distinguishable from the possible interpretations of e_2 in A). ■ 

For example, Insert(Insert(∅, 1), 1) and Insert(∅, 1) are specified by the specification of 
Set-Int to be observably equivalent; Insert(∅, 1) and Insert(∅, 2) are distinguishable. 
However, the specification in Figure 3.2 does not specify Pop(Null) and Null to be 
observably equivalent or distinguishable. If S is inconsistent, there are ground terms which 
are both observably equivalent as well as distinguishable by S, because F(S) is the empty 
set. 

Since a specification S may leave the behavior of operations unspecified on 
certain inputs using the precondition and/or optional exception restrictions, there may in 
general exist ground terms of type D' ∈ A' which are neither specified by S to be observably 
equivalent nor distinguishable. For example, Pop(Null) is neither observably equivalent to 
Null nor distinguishable from Null by the specification of Stk-Int in Figure 3.2, as a data 
type in D(Stk-Int) may have Pop return the empty stack itself when invoked on the empty 
stack and another data type in D(Stk-Int) may have Pop signal on the empty stack. Ground 
terms involving nondeterministic operations may also be neither observably equivalent nor 
distinguishable by S; for example, the ground term Choose(Insert(Insert(Null, 1), 3)) is 
neither observably equivalent to nor distinguishable from 3. The above observable 
equivalence and distinguishability relations capture the common behavior of data types in 
D(S). 
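The Pop(Null) example can be sketched as follows. This is a minimal illustration, assuming stacks are encoded as Python tuples and that D(Stk-Int) contains at least the two behaviors described above; the names `pop_lenient`, `pop_strict`, `StackUnderflow`, and the exception label `"underflow"` are hypothetical and are not taken from Figure 3.2.

```python
class StackUnderflow(Exception):
    """Hypothetical exception raised by the model in which Pop signals on Null."""


NULL = ()  # assumed encoding of the empty stack


def pop_lenient(stack):
    """Model 1: Pop on the empty stack returns the empty stack itself."""
    return stack[:-1] if stack else stack


def pop_strict(stack):
    """Model 2: Pop on the empty stack signals."""
    if not stack:
        raise StackUnderflow("Pop on empty stack")
    return stack[:-1]


def outcome(pop, stack):
    """Normalize a call into ('value', v) or ('signal', name)."""
    try:
        return ("value", pop(stack))
    except StackUnderflow:
        return ("signal", "underflow")


models = [pop_lenient, pop_strict]

# Does Pop(Null) behave like Null in each model?
agrees = [outcome(pop, NULL) == ("value", NULL) for pop in models]

# In the spirit of Defs. 3.10 and 3.11: a relation is specified only if it
# holds in every data type under consideration.
specified_equivalent = all(agrees)
specified_distinguishable = not any(agrees)
```

With these two models neither flag is set, matching the remark that the specification leaves the relation between Pop(Null) and Null open.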

3.5.2 Behavioral Completeness 

In the definition of behavioral completeness, we must capture the intentional 
incompleteness of a specification. If a specification S associates a nontrivial precondition 
with an operation, different data types in D(S) can have such an operation behaving 
differently on an input not satisfying the precondition. If an operation is specified to have 
an option to signal when its input satisfies a condition, different data types in D(S) can have 
such an operation signalling the specified exception or terminating normally on an input 
satisfying the associated condition. If S specifies a nondeterministic operation, different 
data types in D(S) can have such an operation having as much nondeterminism as desired. 
This incompleteness in S is intentional. Any other difference in the behavior of data types 
in D(S) is unintentional. 

The above means that for a specification S to be behaviorally complete, data types 
in D(S) having maximal amount of nondeterminism allowed by S must have the same 
observable behavior on intended inputs, except that if there is an optional exception 
condition specified for an operation, then the operation has the option of signalling or 
terminating normally on an input satisfying the boolean condition in the optional exception 
condition. 

We define three relations on the models in F(S). The partial isomorphic 
equivalence relation formalizes the intentional incompleteness introduced due to the 
nontrivial preconditions specified for the operations in S. The isomorphic embeddability 
relation formalizes the intentional incompleteness due to the operations specified to have 
the option to signal exceptions. Later we combine them to define the partial isomorphic 
embeddability on reduced models in F(S). We use the partial isomorphic embeddability 
relation to define the behavioral completeness of a specification by relating the reduced 
models of data types in D(S) having the maximum amount of nondeterminism allowed by 
the specification S. 



3.5.2.1 Partial Isomorphic Equivalence 



Let $P_\sigma$ be a precondition specified for $\sigma$ in S. Let S' be the specification of a 
defining type D' ∈ A in S. The partial isomorphic equivalence relation relates models 
whose operations have the same behavior on inputs satisfying their preconditions. The 
definition is obtained by modifying the definition of isomorphic equivalence (Def. 2.13) 
given in Chapter 2. As in Chapter 2, we assume that the domains corresponding to each 
D' ∈ A in models $A_1$ and $A_2$ are defined by the isomorphically equivalent models in F(S') 
and that the isomorphic equivalence relation on these models in F(S') induces a bijection 
$\Phi_{D'} : V^1_{D'} \to V^2_{D'}$. 

Def. 3.12 Given two algebras $A_1$ and $A_2$ in F(S) 

$A_1 = [\,\{V^1_{D'} \mid D' \in A'\},\ EXV_1;\ \{f^1_\sigma \mid \sigma \in \Omega\}\,]$ 

$A_2 = [\,\{V^2_{D'} \mid D' \in A'\},\ EXV_2;\ \{f^2_\sigma \mid \sigma \in \Omega\}\,]$ 

such that for each D' ∈ A, $V^1_{D'}$ and $V^2_{D'}$ are the value sets defined by isomorphically 
equivalent models $A'_1$ and $A'_2$ in F(S'), where S' is a specification of D', and 
$\Phi_{D'} : V^1_{D'} \to V^2_{D'}$ is a bijection induced due to the isomorphic equivalence of 
$A'_1$ and $A'_2$, $A_1$ and $A_2$ are isomorphically equivalent w.r.t. $\{P_\sigma \mid \sigma \in \Omega\}$ 
(or w.r.t. S) iff there are bijections $\Phi_D : V^1_D \to V^2_D$ and $\Phi_{EXV} : EXV_1 \to EXV_2$ 
such that $\Phi = \{\Phi_{D'} \mid D' \in A'\} \cup \{\Phi_{EXV}\}$ has the following properties: 

(i) For each $ex : D_1 \times \ldots \times D_n$, and for every $v_1$ of type $D_1$, ..., $v_n$ of type $D_n$, 

$\Phi_{EXV}(ex(v_1, \ldots, v_n)) = ex(\Phi_{D_1}(v_1), \ldots, \Phi_{D_n}(v_n))$, and 

(ii) for each $\sigma \in \Omega$, $\sigma : D_1 \times \ldots \times D_n \to D'$, 
for every $v_1$ of type $D_1$, ..., $v_n$ of type $D_n$, if $P_\sigma(v_1, \ldots, v_n) = T$, then 

(a) if neither $f^1_\sigma$ nor $f^2_\sigma$ signals, then 

$\Phi_{D'}(f^1_\sigma(v_1, \ldots, v_n)) = f^2_\sigma(\Phi_{D_1}(v_1), \ldots, \Phi_{D_n}(v_n))$; otherwise, 

(b) $\Phi_{EXV}(f^1_\sigma(v_1, \ldots, v_n)) = f^2_\sigma(\Phi_{D_1}(v_1), \ldots, \Phi_{D_n}(v_n))$. 






We also call $A_1$ and $A_2$ partially isomorphically equivalent, when $\{P_\sigma \mid \sigma \in \Omega\}$ is evident 
from the context. 

The reason for requiring $\Phi_D$ to be a bijection (and not a partial one to one 
function) is the assumption that for the case when a constructor is specified to have a 
nontrivial precondition, if it terminates normally on an input not satisfying its precondition, 
the value returned can be constructed by the constructors using inputs satisfying their 
preconditions. 
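The idea behind partial isomorphic equivalence, agreement up to a bijection on inputs that satisfy the precondition, can be checked mechanically on a finite fragment. The sketch below is an illustration only: it assumes stacks over {0, 1} encoded as tuples, takes the identity map as the bijection, and uses hypothetical names (`pop_a`, `pop_b`, `pre_pop`, `partially_equivalent`, the label `"underflow"`) that are not part of the formal definition.

```python
from itertools import product


def all_stacks(max_len, elems=(0, 1)):
    """Enumerate every stack (as a tuple) of length 0..max_len."""
    return [s for n in range(max_len + 1) for s in product(elems, repeat=n)]


def pop_a(s):
    """Model A: Pop never signals; on the empty stack it returns the empty stack."""
    return ("value", s[:-1])


def pop_b(s):
    """Model B: Pop signals on the empty stack."""
    return ("value", s[:-1]) if s else ("signal", "underflow")


def pre_pop(s):
    """Assumed precondition for Pop: the stack is nonempty."""
    return len(s) > 0


def phi(s):
    """Bijection between the two value sets; the identity in this sketch."""
    return s


def map_outcome(result):
    """Apply phi to a normal result; leave a signalled exception unchanged."""
    kind, payload = result
    return (kind, phi(payload)) if kind == "value" else result


def partially_equivalent(f1, f2, pre, inputs):
    """Compare f1 and f2 only on inputs satisfying the precondition."""
    return all(map_outcome(f1(x)) == f2(phi(x)) for x in inputs if pre(x))


stacks = all_stacks(3)
agree_on_intended = partially_equivalent(pop_a, pop_b, pre_pop, stacks)
agree_everywhere = partially_equivalent(pop_a, pop_b, lambda s: True, stacks)
```

The two models agree on every intended input but not on the empty stack, which is the kind of difference the precondition is meant to excuse.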

3.5.2.2 Isomorphic Embeddability 

In the definition of isomorphic embeddability relation, we want to capture the 
intuition that if a specification S associates an optional exception condition with an 
operation σ, then on an input X satisfying the associated boolean condition O(X), the 
function corresponding to σ either behaves the same in different algebras in F(S) (i.e., it 
either returns the 'same' value or signals the 'same' exception value), or the function 
behavior differs in different algebras to the extent that in one algebra, the function signals 
the desired exception value and in the other, the function returns the desired normal value. 
The condition (iii) in the definition below captures this. 

If any constructor σ is specified to optionally signal, then the value set of D 
defined by one algebra in F(S) may be a subset of the value set of D defined by another 
algebra in F(S). (In fact, one value set may have a value that is distinguishable from every 
value in the other value set.) That is why in the definition below, we do not require the 
mapping relating the value sets of D in two algebras to be a bijection; instead, it is required 
to be a one to one partial function.⁹ However, the mapping must be defined for every 
value constructed by the function corresponding to a constructor σ using inputs which 
satisfy the associated precondition and do not satisfy any boolean condition stated in a 
required exception condition or an optional exception condition specified for σ. This 
constraint is captured in the condition (i) below. 



9. That is also the reason for calling the relation isomorphically embeddable. 






Def. 3.13 Given two algebras $A_1$ and $A_2$ in F(S) satisfying the requirement about the 
domain corresponding to D' ∈ A stated in Def. 3.12, $A_2$ is isomorphically embeddable in $A_1$ 
w.r.t. S iff there exist 1-1 partial functions $\Phi_D : V^1_D \to V^2_D$ and 
$\Phi_{EXV} : EXV_1 \to EXV_2$, with the following properties: 

(i) for every set of values $v_1, \ldots, v_n$, for a constructor $\sigma$, if 

(a) $P_\sigma[x_1/v_1, \ldots, x_n/v_n]$ holds, 

(b) for every required exception condition specified for $\sigma$, its boolean condition 
$R[x_1/v_1, \ldots, x_n/v_n]$ does not hold, and 

(c) for every optional exception condition specified for $\sigma$, its boolean condition 
$O[x_1/v_1, \ldots, x_n/v_n]$ does not hold, 

then $\Phi_D$ is defined for every value $f^1_\sigma(v_1, \ldots, v_n)$, 

(ii) for every exception name $ex : D'_1 \times \ldots \times D'_m$, 

$\Phi_{EXV}(ex(v'_1, \ldots, v'_m)) = ex(\Phi_{D'_1}(v'_1), \ldots, \Phi_{D'_m}(v'_m))$ if $\Phi_{D'_i}(v'_i)$ is defined for each 
$1 \le i \le m$, and 

(iii) for each $\sigma \in \Omega$, for every set of values $v_1, \ldots, v_n$ such that $\Phi_{D_i}(v_i)$ is defined for each 
$1 \le i \le n$, 

(a) if on $v_1, \ldots, v_n$, $f^1_\sigma$ signals an exception value $ex(v'_1, \ldots, v'_m)$ specified to be 
optional by S, then the associated condition $O(x_1, \ldots, x_n)$ holds on $v_1, \ldots, v_n$, and 
$f^2_\sigma(\Phi_{D_1}(v_1), \ldots, \Phi_{D_n}(v_n))$ either signals $ex(\Phi_{D'_1}(v'_1), \ldots, \Phi_{D'_m}(v'_m))$ or returns $\Phi_{D'}(v')$ for 
some $v'$, or 

(b) if $\Phi_{D'_1}(v'_1), \ldots, \Phi_{D'_m}(v'_m)$ are defined and $f^2_\sigma$ signals an exception value 
$ex(\Phi_{D'_1}(v'_1), \ldots, \Phi_{D'_m}(v'_m))$ specified to be optional by S on input $\Phi_{D_1}(v_1), \ldots, \Phi_{D_n}(v_n)$, 
then the associated condition $O(x_1, \ldots, x_n)$ holds on $\Phi_{D_1}(v_1), \ldots, \Phi_{D_n}(v_n)$, and 
$f^1_\sigma(v_1, \ldots, v_n)$ either signals $ex(v'_1, \ldots, v'_m)$ or returns $v'$; otherwise, [...] 

For example, let us modify the model A_stk discussed in Subsection 2.3.2 so that 
the function corresponding to Push signals overflow if sequence size is 128, instead of 100, 
and call the modified model A'_stk. It can be shown that A_stk is isomorphically 
embeddable in A'_stk. A'_stk is 'bigger' than A_stk because the value set corresponding to 
Stk has more elements in A'_stk than in A_stk. When optional exception conditions for 
constructors are specified to state a least upper bound on the size of the values of the data 
type, as in case of the specification of Stk-Int in Figure 3.2, different algebras in F(S) may 
have different upper bounds on the size of the values in their value sets. 
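The relationship between the two stack models can be illustrated with a short sketch. It is not a check of Def. 3.13 in full: it assumes stacks encoded as tuples, the identity map as the partial embedding, and a hypothetical optional exception condition "size is at least 100" (`OPTIONAL_LIMIT`), since the exact condition in Figure 3.2 is not reproduced here. The names `make_push`, `optional_condition`, and `compatible` are likewise illustrative.

```python
OPTIONAL_LIMIT = 100  # assumed: Push may optionally signal once size reaches 100


def make_push(bound):
    """Build a Push that signals overflow when the stack already has `bound` items."""
    def push(stack, item):
        if len(stack) >= bound:
            return ("signal", "overflow")
        return ("value", stack + (item,))
    return push


push_small = make_push(100)   # behaves like the model that overflows at 100
push_big = make_push(128)     # behaves like the modified model that overflows at 128


def optional_condition(stack):
    """Assumed optional exception condition for Push."""
    return len(stack) >= OPTIONAL_LIMIT


def compatible(stack, item):
    """Same behavior, or the small model signals the optional exception while
    the big model returns normally on an input satisfying the condition."""
    small, big = push_small(stack, item), push_big(stack, item)
    if small == big:
        return True
    return (optional_condition(stack)
            and small == ("signal", "overflow")
            and big[0] == "value")


# Sample stacks drawn from the smaller value set (sizes 0, 1, 99, 100).
samples = [tuple(range(n)) for n in (0, 1, 99, 100)]
all_compatible = all(compatible(s, 7) for s in samples)
```

On a stack of size 100 the two models differ only in the permitted way: one signals overflow and the other returns a stack of size 101, a value that has no counterpart in the smaller model.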

3.5.2.3 Partial Isomorphic Embeddability 

We combine the notions of partial isomorphic equivalence and isomorphic 
embeddability to define another relation. The new relation captures both kinds of 
intentional incompleteness, due to preconditions as well as due to optional exception 
conditions. 

Def. 3.14 A_1 is partially isomorphically embeddable w.r.t. S in A_2 if and only if there exists 
a model A' in F(S) such that A' is partially isomorphically equivalent to A_1 and A' is 
isomorphically embeddable in A_2. ■ 

3.5.2.4 Definition of Behavioral Completeness 

We define behavioral completeness of a specification by relating the reduced 
models of the data types having maximal amount of nondeterminism allowed by S in D(S) 
using the partial isomorphic embeddability relation. The definition of behavioral 
completeness is a single level definition in the sense that a specification S can be 
behaviorally complete irrespective of whether a specification of a defining type in S is 
behaviorally complete. If the specification of a defining type is behaviorally incomplete, 
the incompleteness will be reflected in the semantics of a behaviorally complete S. So, in 
the definition, we consider only reduced models in F(S) that have the domains 
corresponding to each D' ∈ A defined by the isomorphically equivalent models in F(S'), 
where S' is a specification of D'. 






Def. 3.15 A specification S is behaviorally complete iff (i) S is inconsistent, or (ii) for any 
two reduced models A_1 and A_2 in F(S) having maximum amount of nondeterminism 
allowed by S and whose domains corresponding to each D' ∈ A are defined by the 
isomorphically equivalent models in F(S'), where S' is a specification of D', A_1 is partially 
isomorphically embeddable in A_2 or vice versa. ■ 

The reasons for having the first case this way in the above definition are that for 
an inconsistent S, F(S) = ∅, so any relation among algebras in F(S) holds, and that we 
want our definitions to be compatible with the definitions of consistency and completeness 
in logic, in which an inconsistent theory is complete. 

For example, the specifications of Set-Int, Stk-Int, and Bool in Figures 3.1, 3.2, 
and 3.4 respectively can be shown to be behaviorally complete. Note that any specification 
not specifying any observers is trivially behaviorally complete. We can show the following: 

Thm. 3J For a specification S specifying only deterministic operations and not specifying 
any precondition or an optional exception condition for an operation, a consistent S is 
behaviorally complete iff S precisely specifies a data type D assuming that the specification 
S' of every D' ∈ A precisely specifies D'. 

Proof The above definition of behavioral completeness reduces under the stated 
conditions to requiring that the reduced models in F(S) are isomorphically equivalent.¹⁰ 
This means that F(S) = M_D. 
Hence the theorem. ■ 

The behavioral completeness property guarantees that the behavior of the 
operations has not been left unintentionally unspecified. However, there are situations 
when the behavioral completeness requirement on specifications is restrictive [31, 51]. For 
example, consider a modified version of the specification of Set-Int in Figure 3.1 in which 
Choose is not specified to be nondeterministic. In such a specification also, we do not wish to 



10. If a specification does not specify a nontrivial precondition for an operation and also does not specify any 
optional exception condition, the partial isomorphic embeddability relation reduces to isomorphic 
equivalence. 






commit to the value Choose may return on a nonempty set, so the axiom specifying 
Choose is still 

Choose(s) ∈ s ≡ T. 
This specification is not behaviorally complete. We would want such a specification to be 
behaviorally incomplete, as otherwise Choose must be completely specified. The 
behavioral completeness requirement is restrictive in such a case because the reduced 
algebras in the semantics of the modified specification are not isomorphically equivalent. 
For example, in one reduced algebra, the function corresponding to Choose when applied 
on { 1, 3 } may return 1, while in another reduced algebra, the corresponding function may 
return 3. For most specifications specifying nondeterministic operations, if we modify such 
a specification so that an operation specified originally to be nondeterministic is instead 
specified to be deterministic, then we would often want the modified specification to be 
behaviorally incomplete. 
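The Choose example can be sketched directly. The two functions below are hypothetical deterministic interpretations of Choose (picking the minimum and the maximum); they are assumptions made for illustration, not operations defined in Figure 3.1. Both satisfy the axiom that the chosen element belongs to the set, yet they return different values on { 1, 3 }.

```python
def choose_min(s):
    """One deterministic interpretation of Choose: the smallest element."""
    return min(s)


def choose_max(s):
    """Another deterministic interpretation of Choose: the largest element."""
    return max(s)


def satisfies_axiom(choose, sets):
    """Check the axiom 'Choose(s) is a member of s' on the given nonempty sets."""
    return all(choose(s) in s for s in sets)


sample_sets = [frozenset({1, 3}), frozenset({2}), frozenset({4, 5, 6})]

both_satisfy = (satisfies_axiom(choose_min, sample_sets)
                and satisfies_axiom(choose_max, sample_sets))
results_on_1_3 = (choose_min(frozenset({1, 3})), choose_max(frozenset({1, 3})))
```

Since both interpretations are admissible and no bijection on the integers relates them while preserving the other operations, the corresponding reduced algebras cannot be isomorphically equivalent.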






3.6 Comparison With Related Works 

We compare our specification language with those of Guttag et al. [29] with 
extensions proposed in [31], Zilles [77], the ADJ group [22, 23], Burstall and Goguen [7], 
Goguen and Tardo [21], and Nakajima et al. [62]. We first discuss the capabilities of these 
specification languages and the approach used to give their semantics. Later, we compare 
the semantics of a specification in these languages. 

Zilles [77] and ADJ [23] do not allow auxiliary functions in a specification, so their 
languages have a limited expressive power. Zilles [77] assumes that the operations of a data 
type are deterministic and that they do not signal exceptions. The ADJ [23] do not allow 
nondeterministic operations either; they adopt the simpler approach discussed in 
Subsection 2.3.3 for modeling exceptions, and discuss a specification language embodying 
this approach. Goguen [20] extended the ADJ method of modeling exceptions, which we 
compared with our approach in Subsection 2.3.2. His approach for specifying exceptional 
behavior of the operations is different from our approach; it is motivated by the view that 
exception values are like normal values (and so they are typed). The exceptional behavior 
of the operations is specified using equations. Our language is richer than his language 
because of the preconditions and the distinction made between optional exception 
conditions and required exception conditions. His semantics of the specification method is 
complex. 

Burstall and Goguen's [7] CLEAR language and its extension, the OBJ language, 
support hierarchical structure and modularity like our language. However, Burstall and 
Goguen have ambitious goals; they are attempting to develop a general purpose 
specification language based on algebraic semantics in which the semantics of a 
programming language can be specified. So they are forced to introduce complex 
mechanisms, for instance, procedures operating on theories, which make the specification 
language hard to understand. The category-theoretic semantics of their language is also 
complex [30]. Our approach instead has been to concentrate on the data component of 
programs, and develop a specification language and a formalism for data types. Our 
semantic method is simpler. 

Guttag et al.'s work [29] is the closest to our work. Their language is limited as it 






cannot specify data types with nondeterministic operations. As was said in Section 3.1, our 
specification language is an enrichment of the specification language in [31]. Our 
formalism can provide a semantics for their specification language. Our formalism can also 
be used to provide a mathematical basis of the AFFIRM system [60, 61]. In this sense, our 
formalism places their work on a firm basis. 

Nakajima et al. [62] specify a data type, as discussed in Chapter 1, as a first order 
theory. Their method differs from other methods including our method because they allow 
any first order formula to be an axiom in a specification. Auxiliary functions are not 
allowed in a specification. Operations are assumed to be deterministic; they do not signal 
exceptions. We have not yet seen the semantics of their specification language. If we 
assume that a first order theory is interpreted in a standard way as in Logic [16], the 
problems with this approach are discussed in the related work section of the first chapter. 
We further comment on their specification method in the next chapter from the point of 
view of deducing properties from a specification. 

Burstall and Goguen, Nakajima et al., and Guttag [31] can specify a type scheme 
(also called a parameterized type) in their languages. Recently, the ADJ group [71] has 
given a category theoretic semantics of a parameterized type. Our specification language, 
as it is, cannot express a parameterized type. However it should be evident from the 
discussion that our formalism as well as specification language can be easily extended to 
parameterized types. We discuss these extensions in the last chapter of the thesis. 

There are differences between our semantics of a specification, and those of 
Zilles, the ADJ group, and Guttag et al. [28], which are motivated by different definitions 
of a data type used in various formalisms. Zilles and the ADJ group assume that values not 
specified to be related by the axioms are different, even if they are observably equivalent. 
Guttag et al., on the contrary, assume that the values are equivalent unless specified to be 
different. We have taken a different approach; we consider the axioms as specifying the 
observable equivalence relation. Our approach towards the semantics of a specification is 
similar to the one adopted in logic; we consider all models of the axioms to be the 
semantics of the specification. (Of course, we consider only the algebras satisfying the 
minimality property for modeling data types, and rule out nonstandard models.) Our 






semantics thus subsumes Zilles's and the ADJ group's definitions, as well as Guttag et al.'s 
definition in the following way. 

To understand the semantics of a specification in the ADJ group formalism as 
well as in Zilles's formalism, we introduce the following definition. As is stated in 
Subsection 2.2.6, the models in F(S) can be partially ordered using the onto 
homomorphism relation, i.e., A_1 ≤ A_2 if and only if A_1 is a homomorphic image of A_2. 

Def. 3.16 A model A in F(S) is called initial if A is a maximal model with respect to the 
homomorphism relation, and A identically satisfies S. ■ 

In an initial model A, V_D' for each D' ∈ A is a value set defined by an initial model in F(S'), 
where S' is a specification of D'. Two members in V_D are not the same unless they are 
related by the axioms and restrictions. The ADJ group and Zilles define the semantics of a 
specification S to be the set of initial models in F(S). Guttag et al., on the other hand, 
define the semantics of a specification S to be the set of reduced models in F(S). 






4. Deductive System 

In this chapter, we develop a deductive system for abstract data types. The 
deductive system embodies general properties of data types which are not explicitly stated 
in a specification but assumed in the semantics of the specification language. We construct 
a theory of a data type, which is a collection of properties of the data type, from its 
specification. The theory of a data type can be used in reasoning about programs and 
designs that use the data type in the same way as the properties of natural numbers are used 
in reasoning about programs operating on natural numbers. In particular, the correctness 
proof of an implementation of a data type with respect to its specification as discussed in 
the next chapter, involves the use of the theories of its defining types and the theory of its 
rep, the data type whose values are used to represent the values of D in the 
implementation. We can pose questions about the behavior of a data type and check 
whether they can be answered from its specification according to our intentions using the 
deductive system. In this sense, constructing the theory of a data type can enhance our 
confidence in its specification. 

The construction of the theory of a data type from its specification has an 
important advantage that the theory does not depend on any particular implementation of 
the data type. The correctness criterion used for implementations in Chapter 5 guarantees 
that every property in the theory is satisfied by every correct implementation. We can thus 
reason about programs using a data type abstractly without referring to any particular 
implementation of the data type. This separation between the theory of a data type and its 
implementations via the specification factors the proof process into two independent parts: 
(i) Proof of use of a data type, and (ii) proof of correctness of implementation of a data type 
[37]. In this chapter, we discuss the first part; we discuss the second part in the next 
chapter. 

The theory of a data type is constructed hierarchically from its specification, using 
the theories of the types used in the specification, just like the specification of a data type is 
designed. The design of our specification language has been influenced by the goal that a 
specification should not have to state more than what is required and that it be structured 






in the sense that different components of the data type behavior are separately specified. 
To construct the theory of a data type from its specification, we combine these components. 
For instance, as is discussed in the previous chapter, an axiom in the axioms component has 
a restricted interpretation: A variable of type D' in the axiom cannot be freely substituted; 
instead, the substitution should be such that the input to every operation symbol satisfies its 
precondition as specified by the restrictions component, and no operation invocation 
should signal. We first construct the unrestricted axioms from the restricted axioms in the 
axioms component of a specification using the restrictions; these unrestricted axioms are 
used to construct the theory. Henceforth, we refer to a (restricted) axiom in the axioms 
component of a specification as a formula and to an unrestricted axiom as an axiom to 
avoid confusion. 

The proposed deductive system is used to prove properties manually. We have 
not investigated the possibilities of automating the deductive system, but we relate our 
work to Musser's work [60, 61] on automating the proof theory of data types from their 
algebraic specifications. 

Instead of discussing the complete deductive system and the construction of a 
theory from a specification specifying nondeterministic operations and operations 
exhibiting exceptional behavior in a single shot, we do so step by step. We first discuss the 
theory of a data type with deterministic operations and without considering their 
exceptional behavior. We then incorporate the exceptional behavior of data types into 
their theory. Finally, we discuss data types with nondeterministic operations to exhibit the 
extra machinery needed for introducing nondeterminism. 

For specifications specifying only deterministic operations, we discuss various 
subtheories, namely, the equational subtheory, distinguishability subtheory, inductive 
subtheory, constructed using different fragments of the deductive system. We define three 
structural properties of a specification, namely, sufficient completeness, well definedness, 
and completeness. Checking for these properties for a specification is a step towards 
ensuring the correctness of the specification. We precisely state the sufficient completeness 
property defined by Guttag and Horning [28] for a restricted set of specifications and 
extend it to specifications in our specification language. This property requires that the 






behavior of the observers on their intended inputs can be completely determined from the 
specification by purely equational reasoning. We relate this property to the behavioral 
completeness property discussed in the previous chapter, which is model theoretic and 
which requires that the specification completely specify the behavior of the observers on 
intended inputs. Recall that the behavioral completeness property does not say anything 
about what can be deduced from the specification. We show that sufficient completeness is 
stronger than behavioral completeness. 

The completeness property is even stronger than the sufficient completeness 
property, since in addition to the requirement that the behavior of the observers can be 
deduced on any intended input by equational reasoning, it also requires that the 
equivalence of the observable effect of the constructors on intended inputs can be deduced 
from the specification by equational reasoning. 

The well definedness property requires that a specification be modular in the 
sense that it preserve the specifications of defining types and auxiliary types in it. This 
property is stronger than the consistency property. 

In the last section, we define a stronger equivalence on specifications than the 
equivalence defined in Section 3.3. The stronger equivalence of specifications requires that 
not only the two specifications have the same semantics, but their theories must also be the 
same. 






4.1 Preliminaries 

A data type can have many different but equivalent specifications (see Section 3.3 
and Section 4.5). These specifications may differ because 
(i) they may specify the properties of constructors to different extents, 
(ii) the properties of the operations are specified in different ways, and 
(iii) they may use different sets of auxiliary functions. 
Theories constructed from different equivalent specifications can be different, as will be 
clear from the following discussion. Unless stated otherwise, we assume that a data type 
has a single fixed specification; in the last section of the chapter, we discuss theories 
constructed from different but equivalent specifications of a data type. 

If a specification S specifies only a single data type D, then the theory constructed 
from S is the theory of D. If S specifies a set of related data types, then the theory 
constructed from S is the theory of the set of related data types. The theory constructed 
from S consists of properties characterizing the behavior of the algebras in F(S), the 
semantics of S. Let Th(S) stand for the theory constructed from S. 

The deductive system uses multi-sorted (or many sorted) first order predicate 
calculus with identity [16] as the underlying logic. Though a first order theory cannot 
completely characterize the 'infinite' models in F(S), we prefer first order logic over second 
order logic because of the following reasons: 

(i) First order logic is well studied, and is better understood than second order logic, 
(ii) most of the programming logics developed for reasoning about the control structures 
of programming languages are first order, 

(iii) the recent work of Cartwright and McCarthy [8] has established that even the 
termination proof, which was believed to employ second order reasoning, can be 
adequately done in first order logic, 

(iv) most of the work in automatic verification uses first order logic as the underlying 
basis, and 

(v) we believe that most of the interesting properties of programs can be expressed in 
first order logic. 
Multi-sorted logic is more convenient than single-sorted logic as it avoids the use of type 






predicates, which must be introduced in a single-sorted logic to differentiate among terms 
of different types. We use an induction rule having infinitely many premises, which is 
somewhat unusual; the proofs using this rule are infinitary. We interpret the formulas in Th(S) 
in the algebras in F(S); we do not consider uncountable structures because they are not 
type algebras and so they are of no interest. 

As was discussed in the previous chapter, a formula is interpreted in a type 
algebra in the same way as a formula in a structure in Logic [16], except that the symbol ≡ 
is interpreted as the observable equivalence relation (see the definition in Sections 2.2 and 
2.3) on a domain instead of the identity relation. Because the observable equivalence 
relation is an equivalence relation and is preserved by every function in a type algebra, the 
standard rules for identity hold (i.e., the rules for identity are sound under this 
interpretation). 

We now discuss the structure of formulas expressing properties of the models in 
F(S). Following Enderton [16], we define the language of Th(S) as the set of nonlogical 
symbols; the nonlogical symbols are used with the logical symbols to construct formulas.¹ 
Let L(S) stand for the language of Th(S). Instead of defining the complete language of 
Th(S) here, we introduce it incrementally. We discuss here L(S) for a specification 
specifying neither nondeterministic operations nor the exceptional behavior of the operations. 
L(S) includes the operation symbols of D specified by S as well as the auxiliary function 
symbols used in S. Since Th(S) is constructed using the theories of the defining types and 
the theories of the auxiliary types used in S, L(S) includes L(S'), where S' is a specification 
of a data type D', for each D' ∈ A ∪ A_t. 

In Section 4.3 on specifications specifying exceptional behavior of the operations, 
we include the exception names in L(S). In Section 4.4 on specifications specifying 
nondeterministic operations, L(S) includes additional symbols needed for expressing 
properties about nondeterministic operations. 

1. A symbol (or an axiom or a rule of inference) is called nonlogical if it is specific to a particular domain of 
discourse whose theory is being constructed. This is in contrast to logical symbols, which are determined by 
the underlying logic used to develop the theory. For instance, a logical axiom characterizes the logical 
reasoning available in the underlying logic, whereas a nonlogical axiom characterizes a property about the 
domain of discourse. 

Terms of various types can be constructed using the symbols in L(S) and variables 
of various types as discussed in the previous chapter. An atomic formula is an equation of 
the form 'e1 ≡ e2', where e1 and e2 are terms of the same type. Compound formulas are 
constructed from atomic formulas using the standard rules of construction for first order 
predicate calculus with the help of logical symbols. 

We consider a boolean term as a term, rather than an atomic formula; in this 
sense, we adopt a uniform view about the symbols in L(S), considering each as a function 
symbol. This view is especially convenient when we incorporate the exceptional behavior 
of the operations. In case we use a boolean term b as a formula, b is considered as the 
abbreviation for the equation 'b ≡ T.' 

Recall that 'e1 ≡ if b then e2' is an abbreviation for 'e1 ≡ if-then-else(b, e2, e1)', and 
'e1 ≡ if b then e2 else e3' stands for the following two conditional equations: 
    'e1 ≡ if b then e2,' 
    'e1 ≡ if ~ b then e3.' 
In the simple case when exceptional behavior is not considered, 'e1 ≡ if b then e2' is 
equivalent to '(b ≡ T) ⇒ (e1 ≡ e2).' When we incorporate exceptional behavior, the above 
equivalence does not always hold, because b could possibly signal an exception. However, 
if b is guaranteed not to signal, then the above equivalence holds in that case also. 

We use the abbreviation 'e1 ≢ e2' for the formula '~ (∀ x1, ..., xn) [ e1 ≡ e2 ],' 
where x1, ..., xn are the only variables in e1 and e2. Note that if e1 and e2 are ground terms, 
then 'e1 ≢ e2' is equivalent to '~ (e1 ≡ e2).' In fact, it is easy to see that 
    (∀ x1, ..., xn) [ ~ (e1 ≡ e2) ] ⇒ (e1 ≢ e2). 

Only a subset of Th(S) is useful in reasoning about programs and designs using D. 
This subset consists of formulas in Th(S) expressed using only the operation symbols. 
Formulas expressed using auxiliary functions are not directly useful because the auxiliary 
functions are not available to the users of the data type(s) being specified, but these 
formulas help in proving formulas without auxiliary functions. The correctness criterion 
for implementations with respect to a specification S discussed in the next chapter does not 
require a correct implementation to include implementations of auxiliary functions used in 
S. Even if an auxiliary function is implemented, it is not available to the users of a data 
type. 

Let L(D) stand for the language of a data type D, which is a subset of L(S) 
consisting only of the operation symbols. L(S) - L(D) is then the set of auxiliary functions 
used in specifications of various data types. Let Th(D) stand for the subset of Th(S) 
consisting of formulas in Th(S) expressed using the nonlogical symbols in L(D). We are 
primarily interested in formulas in Th(D). The correctness criterion used in the next 
chapter ensures that Th(D) holds for all correct implementations with respect to S. Th(D) 
serves as the interface between programs using D and the correct implementations of D. 
Note that Th(D) does not include those nonlogical axioms of Th(S) which are expressed 
using auxiliary functions. 






4.2 Theory of Data Types without Nondeterminism and without 
Exceptional Behavior 

We start with the simple case of specifications that specify neither 
nondeterministic operations nor the exceptional behavior of the operations. The 
restrictions component of such a specification may specify nontrivial preconditions for 
the operations. For illustration, we modify the data type Set-Int so that Choose is 
deterministic; let Set-Int' stand for the modified Set-Int. The specification of Set-Int' is 
given in Figure 4.1, which is obtained by modifying the specification of Set-Int given in 
Figure 3.1. The syntactic specification of the operation Choose does not have the identifier 
nondeterministic. Instead of the required exception condition for Choose on the empty set, 
we specify '~ (#(s) = 0)' as its precondition in the restrictions component of the specification 
of Set-Int'. 

Figure 4.1. Specification of Set-Int' 

Operations 
    Null   : → Set-Int'                    as ∅ 
    Insert : Set-Int' × Int → Set-Int' 
    Remove : Set-Int' × Int → Set-Int' 
    Has    : Set-Int' × Int → Bool         as x2 ∈ x1 
    Size   : Set-Int' → Int                as #(x1) 
    Choose : Set-Int' → Int 

Restrictions 
    Pre(Choose(s)) :: ~ (#(s) = 0) 

Axioms 
    1. Remove(∅, i) ≡ ∅ 
    2. Remove(Insert(s, i1), i2) ≡ if i1 = i2 then Remove(s, i1) else Insert(Remove(s, i2), i1) 
    3. i ∈ ∅ ≡ F 
    4. i1 ∈ Insert(s, i2) ≡ if i1 = i2 then T else i1 ∈ s 
    5. #(∅) ≡ 0 
    6. #(Insert(s, i)) ≡ if i ∈ s then #(s) else #(s) + 1 
    7. Choose(s) ∈ s ≡ T 

We first discuss how to construct unrestricted nonlogical axioms of Th(S) from 
the formulas in the axioms component and the preconditions specified in S. We then 
discuss how to construct Th(S) from the nonlogical axioms thus obtained. We do so step 
by step, exhibiting the power of various fragments of the deductive system. This will also 
help in investigating how easily these fragments can be automated. We first discuss a 
simple but useful subset of Th(S), called the equational subtheory and written as EQ(S). 
Formulas in EQ(S) are proved using the rules of ≡ and the substitution rule of ∀. Most of 
the work on developing the proof theory of data types from their algebraic specifications 
has focused on this subtheory [23, 71, 7, 21, 29]. 

We discuss later a richer subtheory, called the distinguishability subtheory and 
written as DS(S), having inequalities 'e1 ≢ e2' in addition to equations. The inability to 
prove an inequality has been a major limitation of the recent works on proof theories based 
on algebraic specifications. For instance, both in Zilles's method as well as in ADJ's method, 
two terms e1 and e2 are unequal, i.e., 'e1 ≢ e2' is provable, if and only if 'e1 ≡ e2' is not in the 
equational subtheory, so the proof of inequality becomes meta. Zilles [76] recognizes this 
limitation and suggests also using inequalities as axioms. In our deductive system, 
inequalities can be proved from equations by the method of proof by contradiction. We 
have this advantage because we view two abstract values (i.e., ground terms) of a data type 
to be distinguishable (so unequal) if and only if a sequence of operations can distinguish 
them. This is in contrast to the view taken by the ADJ group and Zilles that two abstract 
values are distinguishable if and only if they are not specified to be equal. 

We later include an induction rule which captures the minimality property of a 
data type. This rule is 'infinite' and is derived from the syntactic specifications of the 
operations and the restrictions component of the specification. More properties of a data 
type can be proved using the induction rule than without it. We discuss how the rule is 
used to prove other rules using the nonlogical axioms derived from the specification, which 
simplify the proof of properties of the data type. The subset of equations and inequalities 
provable using the induction rule and the rules of the distinguishability subtheory is called 
the inductive subtheory and written as IND(S). 

We finally construct the full theory Th(S) using the whole machinery of first 
order predicate calculus and the 'infinite' induction rule. We demonstrate the use of Th(S) 
in verifying properties of programs. Every subtheory (as well as the full theory Th(S)) is 
constructed hierarchically from the corresponding subtheory (or the full theory) 
constructed from the specifications of the defining types and the auxiliary types used in S. 
For instance, IND(S) is constructed from IND(S'), where S' is a specification of 
D' ∈ A ∪ A_t. 

In the last subsection, we define sufficient completeness, completeness, and well 
definedness properties of a specification, and relate them to behavioral completeness and 
consistency properties discussed in Section 3.5. 

4.2.1 Derivation of Nonlogical Axioms 

The unrestricted nonlogical axioms for a specification S can be derived in a 
straightforward way. If S specifies a nontrivial precondition for some operations, then the 
nonlogical axioms are generally conditional equations. Let PC_e stand for a conjunction of 
conditions of the form 'P_σ(e1, ..., en) ≡ T' for every occurrence of σ having the input 
e1, ..., en in e. If an equation 'e1 ≡ e2' is in the axioms component of S, the corresponding 
nonlogical axiom of Th(S) is the formula 
    (PC_e1 ∧ PC_e2) ⇒ (e1 ≡ e2). 
For example, the formula 
    Choose(s) ∈ s ≡ T 
has an occurrence of the operation Choose, which is specified to have the nontrivial 
precondition, so the corresponding unrestricted nonlogical axiom is 
    (~ (#(s) = 0) ≡ T) ⇒ (Choose(s) ∈ s ≡ T). 

If a formula in the axioms component does not have any operation specified to 
have a nontrivial precondition, then the formula itself serves as a nonlogical axiom. For 
example, the formula 
    #(Insert(s, i)) ≡ if i ∈ s then #(s) else #(s) + 1 
itself serves as a nonlogical axiom. 

For any restricted quantifier-free formula Φ, the corresponding unrestricted 
formula is 'PC ⇒ Φ', where PC is a conjunction of the formulas PC_e for every term e in 
the formula Φ. 
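
The derivation just described is mechanical, and a minimal Python sketch may make that concrete. The term encoding (nested tuples), the table PRECONDITIONS, and the function names are assumptions made only for this illustration; they are not part of the specification language.

```python
# Terms are nested tuples: ("Choose", "s") stands for Choose(s).
# Bare strings stand for variables or constants such as "s", "0", "T".
# PRECONDITIONS is a hypothetical table mapping an operation name to a function
# that builds its precondition term; only Choose is restricted in Set-Int'.
PRECONDITIONS = {
    "Choose": lambda s: ("not", ("eq", ("Size", s), "0")),
}

def precondition_conjuncts(term):
    """Collect the precondition of every occurrence of a restricted operation in term."""
    if isinstance(term, str):
        return []
    op, *args = term
    conjuncts = []
    for arg in args:
        conjuncts.extend(precondition_conjuncts(arg))
    if op in PRECONDITIONS:
        conjuncts.append(PRECONDITIONS[op](*args))
    return conjuncts

def unrestricted_axiom(lhs, rhs):
    """Return (conjuncts, equation); an empty list means the equation is itself the axiom."""
    return precondition_conjuncts(lhs) + precondition_conjuncts(rhs), (lhs, rhs)

# Axiom 7 of Set-Int' mentions Choose, so it becomes a conditional equation.
axiom7 = unrestricted_axiom(("Has", "s", ("Choose", "s")), "T")
# Axiom 5 mentions no restricted operation, so it is unchanged.
axiom5 = unrestricted_axiom(("Size", ("Null",)), "0")
```

Here axiom7 carries the single conjunct for '~ (#(s) = 0)', while axiom5 carries none.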






4.2.2 Equational Subtheory 

The equational subtheory EQ(S) consists of equations derived from the 
nonlogical axioms of S. An equation 'e1 ≡ e2' is in EQ(S) if and only if it is provable from 
the nonlogical axioms of S and EQ(S'), where S' is a specification of D', for each 
D' ∈ A ∪ A_t, using the four rules of ≡, namely, 

(i) reflexivity, 

(ii) symmetry, 

(iii) transitivity, 

(iv) substitution property of every function symbol, 

and, 

(v) the substitution rule for the universal quantifier ∀ (i.e., substituting an appropriate 
term for every occurrence of a free variable in a nonlogical axiom). 

Not all five of the above rules are necessary; some of them can be derived from the others 
[16]. As an illustration, we give a proof of the equation '#(Insert(Insert(Null, i), i)) ≡ 1' in 
Figure 4.2. 

EQ(S) defines a relation on ground terms of different types; let EQ_D' stand for 
this relation on ground terms of type D'. For any ground terms e1 and e2, <e1, e2> ∈ EQ_D' if 
and only if 'e1 ≡ e2' ∈ EQ(S). 

If the nonlogical axioms are equations (possibly using if-then-else functions), they 
can be considered as unidirectional rewrite rules by defining an appropriate ordering on 
terms. If a decision procedure for EQ(S) exists (i.e., the relation EQ_D' for each 
D' ∈ A ∪ A_t ∪ {D} is decidable), then it is often possible to generate a convergent set of 
rewrite rules from the nonlogical axioms using the Knuth-Bendix algorithm [44], which 
constitutes the decision procedure for EQ(S). The AFFIRM system [60] is designed in part 
around this result. Though nonlogical axioms using if-then-else functions have been 
studied [60, 21, 5], there appear to be some difficulties in using the Knuth-Bendix 
algorithm on them [61]. 

Figure 4.2. Proof of '#(Insert(Insert(Null, i), i)) ≡ 1' 

1. i ∈ Insert(Null, i) ≡ T                           Substitution in Axiom 4 of Set-Int' and the theorem of Int. 
2. #(Insert(Insert(Null, i), i)) ≡ #(Insert(Null, i))  Step 1, substitution in Axiom 6 of Set-Int'. 
3.     ≡ #(Null) + 1                                 Axiom 3, substitution in Axiom 6 of Set-Int', and transitivity. 
4.     ≡ 0 + 1                                       Axiom 5 of Set-Int'. 
5.     ≡ 1                                           Theorem of Int. 
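
As a rough illustration of reading the axioms as left-to-right rewrite rules, the following Python sketch normalizes ground terms of Set-Int' using Axioms 1 to 6. It is only a hand-written evaluator under assumed encodings (nested tuples for terms, Python ints and bools for Int and Bool values); it is not the Knuth-Bendix algorithm and not the AFFIRM system. Choose is left unevaluated because Axiom 7 does not orient into a rule that computes a value.

```python
NULL = ("Null",)

def rewrite(term):
    """Normalize a ground term of Set-Int' by applying Axioms 1-6 left to right."""
    if not isinstance(term, tuple):
        return term                       # an Int or Bool value is already normal
    op, *args = term
    args = [rewrite(a) for a in args]     # innermost-first strategy
    if op == "Remove":
        s, i2 = args
        if s == NULL:                     # Axiom 1
            return NULL
        _, s1, i1 = s                     # s is Insert(s1, i1) after normalization
        if i1 == i2:                      # Axiom 2, 'then' branch
            return rewrite(("Remove", s1, i2))
        return ("Insert", rewrite(("Remove", s1, i2)), i1)   # Axiom 2, 'else' branch
    if op == "Has":
        s, i1 = args
        if s == NULL:                     # Axiom 3
            return False
        _, s1, i2 = s
        return True if i1 == i2 else rewrite(("Has", s1, i1))  # Axiom 4
    if op == "Size":
        (s,) = args
        if s == NULL:                     # Axiom 5
            return 0
        _, s1, i = s
        n = rewrite(("Size", s1))         # Axiom 6
        return n if rewrite(("Has", s1, i)) else n + 1
    return (op, *args)                    # Null, Insert (and Choose) are left as they are

# The ground instance i = 7 of the equation proved in Figure 4.2.
size_twice = rewrite(("Size", ("Insert", ("Insert", NULL, 7), 7)))
# Remove is rewritten away, leaving a term built from Null and Insert only.
no_remove = rewrite(("Remove", ("Insert", ("Insert", NULL, 1), 2), 1))
```

The first call mirrors the steps of Figure 4.2 for one particular integer; the figure itself proves the equation for the variable i.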

For automating the process of proving properties from the nonlogical axioms of S 
using the above five rules, it may be helpful to view a formula of the form 
    PC ⇒ (e1 ≡ e2), 
where PC is a conjunction 'b1 ≡ T ∧ ... ∧ bn ≡ T', as the formula 
    e1 ≡ if b1 ∧ ... ∧ bn then e2, 
as the two formulas are equivalent and the second formula can be considered as a rewrite 
rule. For example, 
    (~ (#(s) = 0) ≡ T) ⇒ (Choose(s) ∈ s ≡ T) 
can be viewed as 
    Choose(s) ∈ s ≡ if ~ (#(s) = 0) then T. 

4.2.3 Distinguishability Subtheory 

The distinguishability subtheory DS(S) is richer than EQ(S); it has two kinds of 
formulas: (i) 'e1 ≡ e2' and (ii) 'e1 ≢ e2.' Our approach for proving inequalities is simple; it is 
based on the definition of distinguishability discussed in Sections 2.2 and 2.3. The 
distinguishability theory of Bool serves as the basis; since T ≢ F is a formula in the 
specification of Bool, 'T ≢ F' ∈ DS(Bool). (Recall that only the specification of Bool 
includes an inequality as an axiom.) T ≢ F obviously holds in every model of Bool. This 
inequality is used to prove inequalities of terms of type D by reductio ad absurdum (proof 
by contradiction); this is the sixth logical rule, besides the five rules discussed in the 
previous subsection, which is used to construct the subtheory DS(S). We of course use 
inequalities in DS(S'), where S' is a specification of D' ∈ A ∪ A_t. 

Given two terms e1 and e2, we prove 'e1 ≢ e2' as follows: 
    We assume on the contrary that 'e1 ≡ e2;' 
    we then derive 'e1′ ≡ e2′,' where 'e1′ ≢ e2′' is already provable, i.e., either 
    'e1′ ≢ e2′' ∈ DS(S'), or 'e1′ ≢ e2′' ∈ DS(S). 






We illustrate the above rule to prove the inequality 'Null ≢ Insert(s, i)' in Figure 4.3. For 
any ground terms e1 and e2, the formula 'e1 ≢ e2' interprets in a model in F(S) to whether 
the interpretation of e1 is distinguishable from the interpretation of e2. 

The method of proof by contradiction can be integrated into a rewrite rules 
system like AFFIRM. If an inequality 'e1 ≢ e2' is to be proved, we assume 'e1 ≡ e2' as an 
axiom and add it to the set of nonlogical axioms. We get the rewrite rules corresponding 
to the new set of axioms and run them to check whether a contradiction, i.e., one of the 
rules 'T → F' and 'F → T' or 'e1′ → e2′', is generated, where the inequality 'e1′ ≢ e2′' is already 
proved. 
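
The semantic idea behind these proofs, that two ground terms are unequal when some observation separates them, can be sketched in Python. This is an illustration in one particular model (finite sets of integers), not the syntactic proof procedure described above; the function names and the restriction to Has as the only observer are assumptions of the sketch.

```python
def value(term):
    """Interpret a constructor ground term of Set-Int' as a finite set of ints."""
    if term == ("Null",):
        return frozenset()
    op, s, i = term
    return value(s) | {i} if op == "Insert" else value(s) - {i}   # Insert or Remove

def distinguishing_observation(e1, e2, candidates):
    """Return ("Has", i) for some candidate i on which e1 and e2 differ, else None."""
    for i in candidates:
        if (i in value(e1)) != (i in value(e2)):
            return ("Has", i)
    return None

null = ("Null",)
one_element = ("Insert", null, 3)
# Has(Null, 3) is F while Has(Insert(Null, 3), 3) is T, as in Figure 4.3.
witness = distinguishing_observation(null, one_element, [3])
# Inserting in a different order is not observable through Has.
same = distinguishing_observation(
    ("Insert", ("Insert", null, 1), 2),
    ("Insert", ("Insert", null, 2), 1),
    [0, 1, 2, 3],
)
```

A returned witness corresponds to the step in the proof by contradiction where the assumed equation is carried through Has to reach 'F ≡ T'.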

4.2.4 Inductive Subtheory 

The subtheory DS(S) is still not rich enough because there are many useful 
equational formulas which hold for every data type in F(S), but cannot be proved using the 
logical rules of DS(S). For example, the equation 
    Has(Remove(s, i), i) ≡ F 
cannot be proved because 

(i) there is no nonlogical axiom directly expressing the behavior of Has on a set argument 
having the structure Remove(s, i), and 

(ii) Remove(s, i) is not equivalent to Null or an expression of the form Insert(s', i') unless 
some conditions are placed on s. 

But, Has(Remove(s, i), i) ≡ F holds in every model in F(Set-Int'). Even if we use the 
whole deductive system of first order predicate calculus, this formula cannot be proved 
from the nonlogical axioms of Set-Int'. 

Figure 4.3. Proof of 'Null ≢ Insert(s, i)' 

To prove Null ≢ Insert(s, i), 
assume Null ≡ Insert(s, i). 
    Has(Null, i) ≡ Has(Insert(s, i), i),    substitution property of Has 
    F ≡ T,                                  the axioms 3 and 4 of Set-Int', 
which is a contradiction. 
So 'Null ≢ Insert(s, i)' ∈ DS(Set-Int'). 



The above limitation is due to the fact that the minimality property of data types, 
which is captured in the definition of a type algebra, is neither captured in the underlying 
logic nor expressed as a nonlogical axiom (see the discussion of the minimality property in 
Section 2.1). We discuss below an induction rule which captures this property. The rule 
can be constructed from the syntactic specifications of the operations in S. We compare 
our rule with other similar rules proposed in the literature, and demonstrate the inadequacy 
of some of these rules. We discuss how the 'infinite' rule can be used in proofs. For better 
exposition, we first assume that no constructor of D is specified to have a nontrivial 
precondition by S; we later relax this restriction. 

4.2.4.1 Infinite Induction Rule 

Def. 4.1 A ground term e is called a constructor ground term if e is expressed only using 
constructor symbols. 

(†) Induction Rule 

Given a formula Φ(x) with a free variable x of type D: 

    For every constructor ground term e of type D, Φ[x/e] ⊢ (∀ x) Φ(x). 

The above inference rule is infinitary, as there are usually infinitely many constructor 
ground terms of type D and so, the rule requires infinitely many premises. The notion of a 
proof is infinitary whenever the induction rule is used. Intuitively, the above rule states 
that if a formula Φ(x) holds in every case when a value of type D is substituted for x, then 
we can deduce the formula '(∀ x) Φ(x).' It is easy to see that the above rule is sound 
because every type algebra by definition has the minimality property, which states that 
every value of D is represented by some constructor ground term of type D. It is sufficient 
to consider only constructor ground terms because these represent every value in a type 
algebra. 
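
To make the premises of the rule tangible, the following Python sketch enumerates constructor ground terms of Set-Int' up to a small nesting depth and checks one instance Φ[x/e] per term, taking Φ(x) to be 'Has(Remove(x, i), i) ≡ F'. The sample of integers, the depth bound, and the interpretation in finite sets are all assumptions chosen for illustration. A finite check of this kind establishes only finitely many premises and is therefore not a proof by the rule (†).

```python
from itertools import product

INTS = (0, 1)   # a tiny sample of Int values, chosen only for illustration

def constructor_terms(depth):
    """Constructor ground terms of Set-Int' (Null, Insert, Remove) up to a nesting depth."""
    terms = [("Null",)]
    for _ in range(depth):
        terms = [("Null",)] + [
            (op, s, i) for op, s, i in product(("Insert", "Remove"), terms, INTS)
        ]
    return terms

def value(term):
    """Interpret a constructor ground term in the model of finite sets of ints."""
    if term == ("Null",):
        return frozenset()
    op, s, i = term
    return value(s) | {i} if op == "Insert" else value(s) - {i}

def phi(term):
    """The premise Phi[x/term] for Phi(x): Has(Remove(x, i), i) = F, for each sampled i."""
    return all(i not in value(("Remove", term, i)) for i in INTS)

checked = constructor_terms(3)
all_hold = all(phi(e) for e in checked)
```

Each element of checked plays the role of one premise of (†); the rule itself asks for the premise at every constructor ground term, of which there are infinitely many.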

Burstall and Goguen [7] also realized the limitation of the proof theory based on 
the rules of ≡.² They introduced the induce operator on theories; the induced theory is 
equivalent to the original theory with the above induction rule. The above induction rule is 
a generalization of the structural induction rule of Burstall [6]. The structural induction 
rule is based on identifying a minimal set of constructors (instead of all constructors) which 
generates the values of D and has the property that every finite sequence of constructors in 
the subset generates a distinguishable value. To our knowledge, Wegbreit and Spitzen [72] 
were the first to generalize the structural induction rule, but they presented it informally. 
The data induction rule of Guttag et al. [29] is the same as the induction rule of Wegbreit 
and Spitzen. Recently, Musser [61] has suggested a formalization similar to our 
formulation of the rule. 

4.2.4.2 Rationale for an Infinite Induction Rule 

Below, we discuss the rationale for using an infinite rule to capture the 
minimality property of a data type. We demonstrate the inadequacy of an induction 
scheme seemingly suggested by Wegbreit and Spitzen [72], Guttag et al. [29], and Nakajima 
et al. [62]. For illustration, we use a data type denoted by N2. 
N2 has four operations: 0, the constant zero; S, the successor operation; P, 
the predecessor operation; and =, the equality operation. Its specification is given in 
Figure 4.4. The constructor P is derived in the sense that the values returned by P can be 
constructed using 0 and S. We would like to prove from the nonlogical axioms of N2 and 
the induction rule the following normal form lemma in the full theory: 

    (1) (∀ x) [ x ≡ 0 ∨ (∃ y) [ x ≡ S(y) ] ] 

In general, we would like to have in Th(N2) the scheme 

    (2) ( Φ(0) ∧ (∀ x) [ Φ(x) ⇒ Φ(S(x)) ] ) ⇒ (∀ x) Φ(x), 

where Φ is a first order formula with at least one free variable. 

If we express the minimality property of N2 with the following scheme: 

    (3) ( Φ(0) ∧ (∀ x) [ Φ(x) ⇒ (Φ(P(x)) ∧ Φ(S(x))) ] ) ⇒ (∀ x) Φ(x), 

where Φ is a first order formula, we can neither prove (1) nor (2). This is because there are 
nonstandard models of the nonlogical axioms given in Figure 4.4 and the scheme (3), in 
which the scheme of formulas (2) and/or the formula (1) do not hold. Figure 4.5 is one 
such model in which the nonlogical axioms as well as the scheme (3) hold but the formula 
scheme (2) does not hold. The model has an infinite chain going from a constant symbol c 
in both directions in addition to the chain of natural numbers, and there is a unary 
predicate symbol M whose interpretation in the model is the predicate which is false on all 
constants on the negative side of c, and true otherwise. The figure shows the values in the 
model on which the interpretation of M is false. 

Figure 4.4. Specification of Data Type N2 

Operations 
    0 : → N2 
    S : N2 → N2 
    P : N2 → N2 
    = : N2 × N2 → Bool 

Axioms 
    P(0) ≡ 0 
    P(S(x)) ≡ x 
    x = x ≡ T 
    x = y ≡ y = x 
    S(x) = 0 ≡ F 
    S(x) = S(y) ≡ x = y 

2. However, ADJ [71] do not seem to agree that properties provable using the induction rule are relevant 

The scheme (3) does not capture the property that the operation P when applied 
on any natural number will hit in finitely many steps either 0 or a number that behaves like 
0 (in nonstandard models). This property is needed to derive (1) or (2). 

It should be obvious that the scheme (2) as well as the formula (1) hold in every 
model in F(N2). Formulas of the kind (2) and the formula (1) are very useful in proving 
properties of programs using N2. For example, using the formula scheme (2), the proof by 
induction amounts to checking for the basis condition and a single case in the inductive 
step, whereas (3) requires two cases in the inductive step. 

We would like the induction rule to be constructible from the syntactic 
specification so that the rule does not have to be stated explicitly for every data type in its 
specification. In addition, the induction rule should be strong enough so that, for example, 
the formula scheme like (2) and the normal form theorem (1) can be derived in case of N2. 
The above discussion shows that the scheme (3) is not powerful enough. However, the 
infinite induction rule (†) for N2 does the job. It can be shown that the scheme (2) and the 
formula (1) are derivable from that rule. 

Figure 4.5. A Nonstandard Model of the Axioms of N2 with the Scheme (3) 

           S       S       S       S 
        0 ---> 1 ---> 2 ---> 3 ---> 4 ---> ... 
          <---   <---   <---   <--- 
           P       P       P       P 

           S         S         S       S         S 
    ... ---> c-2 ---> c-1 ---> c ---> c+1 ---> c+2 ---> ... 
        <---     <---     <---   <---     <--- 
           P         P         P       P         P 
             F         F 

(The interpretation of M is false on the values marked F.) 
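
The nonstandard model can be explored concretely. The Python sketch below encodes the two chains as tagged pairs, an encoding assumed only for this illustration, and checks the relevant facts on a finite window of the infinite model: the axioms for S and P hold, the premise of scheme (2) holds for M while its conclusion fails, and the premise of scheme (3) fails for M, so this instance of (3) is satisfied vacuously.

```python
# Elements are ("std", n) with n >= 0 for the natural numbers and ("ns", k) for
# the extra chain, where ("ns", 0) plays the role of c and ("ns", k) of c + k.
ZERO = ("std", 0)
C = ("ns", 0)

def S(x):
    kind, k = x
    return (kind, k + 1)

def P(x):
    if x == ZERO:
        return ZERO                      # axiom P(0) = 0
    kind, k = x
    return (kind, k - 1)

def M(x):
    """False exactly on the values on the negative side of c."""
    kind, k = x
    return not (kind == "ns" and k < 0)

# A finite window of the infinite model, large enough to show the behavior.
window = [("std", n) for n in range(50)] + [("ns", k) for k in range(-50, 50)]

axioms_hold = all(P(S(x)) == x and S(x) != ZERO for x in window)
premise_2 = M(ZERO) and all((not M(x)) or M(S(x)) for x in window)
conclusion = all(M(x) for x in window)
premise_3 = M(ZERO) and all((not M(x)) or (M(P(x)) and M(S(x))) for x in window)
```

The premise of (3) breaks at c itself, since M holds at c but not at P(c); this is why (3) places no constraint on M and cannot be used to derive (2).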

Another alternative for characterizing the minimality property is to use 
multisorted second order predicate calculus as the underlying logic and express the 
minimality property as a second order formula. But, this approach is not attractive because 
of the reasons discussed in the first section. 

4.2.4.3 Use of the Induction Rule 

For using the induction rule (†), we must establish infinitely many premises. This 
can be done by imposing a partial ordering on the set of constructor ground terms and 
using induction on ground terms. We discuss below a technique for doing this. We start 
with an instantiation of this technique which uses the structure of the ground terms; this 
method is known as structural induction [6]. We show that 

(i) for each basic constructor σ : D1 × ... × Dn → D, which does not take any argument 
of type D, Φ[x/σ(e1, ..., en)] is provable, and 

(ii) for every other constructor σ ∈ Ω, Φ[x/σ(e1, ..., en)] is provable assuming Φ[x/ei] for 
every Di = D. 

However, there are situations when structural induction is not useful or convenient; 
instead, a different partial ordering on ground terms is preferable. 

We present below a generalized technique. Let G stand for the set of all 
constructor ground terms of type D. We can define an ordering relation (non-reflexive, 
antisymmetric, and transitive) < on G such that (G, <) satisfies the minimum condition. 
Defining < on G gives a generalized (Noetherian) induction rule [10] on G. 

Def. 4.2 (G, <) satisfies the minimum condition iff for every nonempty subset A of G, A has 
a minimal element with respect to <. 

Generalized Induction Rule: 

    If for every e ∈ G, ( for every element e' ∈ G such that e' < e, Φ[x/e'] ) ⇒ Φ[x/e], 
    then (∀ e ∈ G) Φ[x/e]. 

So, in order to establish the infinitely many premises of the 'infinite' induction rule (†), we 
define a partial ordering < on the constructor ground terms in G such that (G, <) has the 
minimum condition and use the generalized induction rule. 

Using the nonlogical axioms of S, one can identify a subset C of G such that for 
every constructor ground term e ∈ G, there is a ground term e' in C such that 
'e ≡ e'' ∈ EQ(S). We can then simplify the induction rule using the following rule of first 
order predicate calculus: 
    (e ≡ e') ⊢ Φ[x/e] ⇔ Φ[x/e']. 
We need to show only that for every ground term e ∈ C, Φ[x/e]. For example, it can be 
shown in case of Bool that for every boolean ground term e, either 'e ≡ T' ∈ EQ(Bool) or 
'e ≡ F' ∈ EQ(Bool). So to prove a property having a free variable of type Bool by 
induction, it suffices to show that the property holds in case of T and F. 
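
For Bool the reduced set C is just {T, F}, so the premises can be checked exhaustively. A small Python sketch, with a helper name and an example property (a De Morgan law) chosen only for illustration:

```python
from itertools import product

def holds_for_all_booleans(prop, arity):
    """Check prop on every assignment of T and F to its boolean variables."""
    return all(prop(*vals) for vals in product((True, False), repeat=arity))

# Example property with two free variables of type Bool.
de_morgan = holds_for_all_booleans(
    lambda a, b: (not (a and b)) == ((not a) or (not b)), 2
)
# A property that fails for one of the two cases.
not_valid = holds_for_all_booleans(lambda a: a, 1)
```

Because every boolean ground term is equationally equal to T or F, checking these finitely many cases covers every premise of the induction rule for Bool.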

3. The property of a set A satisfying the minimum condition with respect to an ordering relation < is related 
to the well foundedness property of A with respect to <. It can be shown that A is well founded with respect to 
< if and only if (A, <) satisfies the minimum condition. 

Let us consider the example of Set-Int'. The induction rule (†) for Set-Int' is: 
    For every constructor ground term e of type Set-Int', Φ[x/e] ⊢ (∀ x) Φ(x). 
The following theorem establishes that the constructor Remove is derived in the sense that 
it does not construct any value of Set-Int' distinguishable from the values constructed by 
Null and Insert. 

Thm. 4.1 Every constructor ground term e of type Set-Int' is equivalent by equational 
reasoning to a ground term e' not having any occurrence of Remove, i.e., the equation 
'e ≡ e'' ∈ EQ(Set-Int'). 

Proof Using induction on the number of Remove (and subsequently the number of Insert) 
in a constructor ground term, we show the above with the help of the axioms 1 and 2 of 
Set-Int'. For details, see Appendix III. 

Using this theorem, we get a simpler induction rule for Set-Int': 

(4) For every constructor ground term e of type Set-Int' having only the occurrences of 
Null and Insert, Φ[x/e] ⊢ (∀ x) Φ(x). 

We can define an ordering generated by the following relation on ground terms 
constructed using Null and Insert: 
    Null < Insert(x, i), and x < Insert(x, i), 
for any constructor ground term x and integer constructor ground term i. Using the 
induction rule (4), we can prove for any formula Φ, 
    (5) ( Φ[x/Null] ∧ (∀ x) [ Φ(x) ⇒ (∀ i) Φ(Insert(x, i)) ] ) ⇒ (∀ x) Φ(x). 
We also get the following normal form theorem for Set-Int' using (5): 
    (∀ s) [ s ≡ Null() ∨ (∃ s', i') s ≡ Insert(s', i') ]. 
Note that the above formula is different from Theorem 4.1. (The above formula is not in 
IND(S) because of the use of the existential quantifier ∃ in it, but it is in Th(S) as discussed 
later.) Theorem 4.1 cannot be expressed in first order predicate calculus. Using the 
scheme (5) and the nonlogical axioms of Set-Int', we prove Has(Remove(s, i), i) ≡ F in 
Figure 4.6. Recall that this formula could not be proved in DS(Set-Int'). 

The inductive subtheory IND(S) consists of equations and inequalities, and is 
defined to be the set of formulas derived from the nonlogical axioms using the six rules 
discussed in the last subsection (meaning DS(S) ⊆ IND(S)) and the infinite induction rule 
(†). We later discuss the conditions under which formulas in IND(S) can be proved using 
the Knuth-Bendix algorithm (Subsection 4.2.7). 

Figure 4.6. Proof of 'Has(Remove(s, i), i) ≡ F' 

We use the formula scheme (5) above. 

Basis: Has(Remove(Null, i), i) ≡ Has(Null, i) ≡ F                    Axioms 1, 3. 

Inductive Step: Assume Has(Remove(s, i), i) ≡ F, 
to show (∀ i1) [ Has(Remove(Insert(s, i1), i), i) ≡ F ]. 

Case 1: i = i1 
    Has(Remove(Insert(s, i1), i), i) ≡ Has(Remove(s, i), i) ≡ F      Axiom 2, and the assumption. 

Case 2: ~ (i = i1) 
    Has(Remove(Insert(s, i1), i), i) ≡ Has(Insert(Remove(s, i), i1), i)   Axiom 2. 
        ≡ Has(Remove(s, i), i) ≡ F                                   Axiom 4 and the assumption. 

Using the scheme (5), we get Has(Remove(s, i), i) ≡ F. 

4.2.4.4 Specifications with Nontrivial Preconditions for Constructors 

The induction rule (†) is also applicable to specifications specifying nontrivial 
preconditions for the constructors, as it captures a general property of data types and not a 
property of specifications. It can be simplified depending on the semantics used for a 
constructor σ on inputs not satisfying its precondition. 

If nontrivial preconditions are specified for constructors, we are interested in 
constructor ground terms in which the input to every constructor invocation satisfies the 
specified precondition. This is so because a constructor is not likely to be invoked with an 
input not satisfying the specified precondition. Even if the constructor is invoked on such 
an input, we are not interested in its behavior. 

Def. 4.3 A constructor ground term e is called legal if and only if (i) e does not have any 
occurrence of an auxiliary function, and (ii) for every subterm of e of the form 
ei = σ(ei1, ..., ein), where σ is a constructor, 'P_σ(ei1, ..., ein) ≡ T' ∈ EQ(S). 

The restriction that 'P_σ(ei1, ..., ein) ≡ T' ∈ EQ(S) is for convenience; we could have 
required the formula to be in Th(S), the full theory constructed from S. (Recall that P_σ(X) 
is a boolean term not involving any quantifier.) We are mostly interested in formulas 
involving legal ground terms. 
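
As an illustration of legality, the following Python sketch checks constructor ground terms of a stack type whose constructors Pop and Replace carry the precondition '~Empty(s)', as in the Stk-Int example used in this chapter with its exception clauses ignored. The tuple encoding and the function names are assumptions of the sketch; auxiliary functions are not represented, so only clause (ii) of the definition is exercised.

```python
def normal_form(term):
    """Return the Null/Push form of a legal Stk term, or None if the term is not legal."""
    op, *args = term
    if op == "Null":
        return ("Null",)
    s = normal_form(args[0])
    if s is None:
        return None                      # an illegal subterm makes the whole term illegal
    if op == "Push":
        return ("Push", s, args[1])
    if s == ("Null",):
        return None                      # precondition ~Empty(s) fails for Pop and Replace
    if op == "Pop":
        return s[1]                      # Pop(Push(s, i)) = s
    if op == "Replace":
        return ("Push", s[1], args[1])   # Replace(s, i) = Push(Pop(s), i)
    raise ValueError(f"unknown constructor: {op}")

def is_legal(term):
    return normal_form(term) is not None

null = ("Null",)
legal_term = ("Pop", ("Push", null, 1))
illegal_term = ("Push", ("Pop", null), 2)
replaced = normal_form(("Replace", ("Push", null, 1), 5))
```

The check evaluates each precondition by equational reasoning on the already normalized argument, which matches the requirement that the precondition be provable in EQ(S).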

Assuming the semantics used in Chapter 3 (i.e., on an input not satisfying its 
precondition, σ returns a value of D constructible by the constructors of D using inputs 
satisfying their preconditions⁴), the induction rule (†) gets simplified to 
    for every legal constructor ground term e of type D, Φ[x/e] ⊢ (∀ x) Φ(x). 
This is so because every constructor ground term that is not legal is equivalent to some legal 
constructor ground term by the above assumption. 

Figure 4.7. Specification of Stk-Int 

Stk-Int as Stk 

Operations 
    Null    : → Stk 
    Push    : Stk × Int → Stk 
                        → overflow(Stk, Int) 
    Pop     : Stk → Stk 
    Top     : Stk → Int 
                  → no-top() 
    Replace : Stk × Int → Stk 
    Empty   : Stk → Bool 

Auxiliary Functions 
    Size : Stk → Int                       as #(x) 

Restrictions 
    Pre(Pop(s)) :: ~Empty(s) 
    Pre(Replace(s, i)) :: ~Empty(s) 
    Empty(s) ⇒ Top(s) signals no-top() 
    Push(s, i) signals overflow(s, i) ⇒ #(s) > 100 

Axioms 
    1. Pop(Push(s, i)) ≡ s 
    2. Top(Push(s, i)) ≡ i 
    3. Replace(s, i) ≡ Push(Pop(s), i) 
    4. Empty(Null) ≡ T 
    5. Empty(Push(s, i)) ≡ F 
    6. #(Null) ≡ 0 
    7. #(Push(s, i)) ≡ #(s) + 1 

If the above assumption about the behavior of $\sigma$ is dropped and nothing is
assumed about its behavior on inputs not satisfying the preconditions, then we have

$$\text{for every legal constructor ground term } e \text{ of type } D,\quad \Phi[x/e] \;\vdash\; (\forall x)\Big[\Big(\bigvee_{i=1}^{m} (\exists x_1, \ldots, x_{n_i})\,[\, x \equiv \sigma_i(x_1, \ldots, x_{n_i}) \wedge P_{\sigma_i}(x_1, \ldots, x_{n_i}) \equiv T \,]\Big) \Rightarrow \Phi(x)\Big],$$

where $\{\sigma_1, \ldots, \sigma_m\}$ is the set of constructors of D. The condition in the matrix of the
consequence of the above rule ensures that x ranges over values serving as the
interpretations of the legal ground terms of D. This is the strongest consequence we can
have because the interpretation of illegal constructor ground terms is not known. For
example, if we drop the restrictions in the specification of Stk-Int repeated in Figure 4.7
specifying the exceptional behavior of the operations, the modified specification associates
preconditions with the constructors Pop and Replace. The induction rule would then be

$$\text{for every legal constructor ground term } e \text{ of type Stk-Int},\quad \Phi[s/e] \;\vdash\; (\forall s)\Big[\Big( s \equiv \mathrm{Null}() \;\vee\; (\exists s', i')\; s \equiv \mathrm{Push}(s', i') \;\vee\; (\exists s')\,[\,\sim\mathrm{Empty}(s') \equiv T \wedge s \equiv \mathrm{Pop}(s')\,] \;\vee\; (\exists s', i')\,[\,\sim\mathrm{Empty}(s') \equiv T \wedge s \equiv \mathrm{Replace}(s', i')\,]\Big) \Rightarrow \Phi(s)\Big].$$

We have discussed in Chapter 3 the reasons for assuming that a constructor $\sigma$ on
an input not satisfying its precondition can either signal an exception or return a value 
constructible by the constructors using inputs satisfying their preconditions. An additional 
reason for this assumption is that otherwise the induction rule gets complex, as should be 
evident from the above discussion. 



4. $\sigma$ can also signal on such an input; since we are considering data types without exceptional behavior, this
choice is ruled out.






4.2.5 The Full Theory 

In proving properties of programs, one often uses properties of data types other
than equations and inequalities. For example, we often need to prove properties of the
form '$(e_{11} \equiv e_{21} \wedge \cdots \wedge e_{1n} \equiv e_{2n}) \Rightarrow (e_1 \equiv e_2)$.' Or, we may need a formula involving
existential quantifiers. For example, consider the union procedure on sets of integers
written in a CLU-like language and given in Figure 4.8. The integer variable i inside the
loop defines the range (-i + 1, i - 1) of integers which have been checked to be members of
the first argument and if so, have been inserted into the result being computed. The
variable i is incremented every time the loop is executed. To prove the termination of
union, we need to show that a set is either empty or there is an integer k such that every
element of the set lies in the range (-k, k). The following formula expresses this property:

(6) $(\forall s)\,[\, s \equiv \mathrm{Null} \;\vee\; (\exists k)(\forall j)\,[\,\mathrm{Has}(s, j) \equiv T \Rightarrow (j \le k \wedge j \ge -k)\,]\,]$

To prove such properties, we need the whole machinery of first order predicate calculus
with identity. The proof of (6) is given in Figure 4.9.

The full theory Th(S) is the set of formulas derivable from the nonlogical axioms 
of S and Th(S'), where S' is a specification of a defining type or an auxiliary type used in S, 
using the logical axioms and rules of inference of multi-sorted first order predicate calculus 

Figure 4.8. Procedure Union

union = proc(s1, s2 : Set-Int') returns (Set-Int')
    i : Int := 0
    r1 : Set-Int' := s1
    r2 : Set-Int' := s2
    while ~ Set-Int'$Size(r1) = 0 do
        if Set-Int'$Has(r1, i) then r1 := Set-Int'$Remove(r1, i)
            r2 := Set-Int'$Insert(r2, i)
        end
        if Set-Int'$Has(r1, -i) then r1 := Set-Int'$Remove(r1, -i)
            r2 := Set-Int'$Insert(r2, -i)
        end
        i := i + 1
    end
    return (r2)
end union






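Figure 4.9. Proof of the Formula (6)

To prove $(\forall s)\,[\, s \equiv \mathrm{Null}() \;\vee\; (\exists i)(\forall j)\,[\,\mathrm{Has}(s, j) \equiv T \Rightarrow (j \le i \wedge j \ge -i)\,]\,]$

Using the scheme (5),
    $\Phi(s) = [\, s \equiv \mathrm{Null} \;\vee\; (\exists i)(\forall j)\,[\,\mathrm{Has}(s, j) \equiv T \Rightarrow (j \le i \wedge j \ge -i)\,]\,]$

Basis $\Phi(\mathrm{Null}) \Leftrightarrow T$

Inductive Step Assume $\Phi(s)$, to show $(\forall k)\,\Phi(\mathrm{Insert}(s, k))$
Since $\Phi(s) \Leftrightarrow T$, we have two cases.
    Case 1 $s \equiv \mathrm{Null}()$
        $\Phi(\mathrm{Insert}(\mathrm{Null}, k)) \Leftrightarrow T$, because i is |k|, the absolute value of k
    Case 2 $(\exists i)(\forall j)\,[\,\mathrm{Has}(s, j) \equiv T \Rightarrow (j \le i \wedge j \ge -i)\,]$
        Subcase 1 $-i \le k \le i$
            i itself serves to prove that $\Phi(\mathrm{Insert}(s, k)) \Leftrightarrow T$ from $\Phi(s)$
        Subcase 2 $k > i \vee k < -i$
            |k| serves as i to prove that $\Phi(\mathrm{Insert}(s, k)) \Leftrightarrow T$ from $\Phi(s)$.

Using the scheme (5), we have $(\forall s)\,\Phi(s)$.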



with identity, as well as the infinitary induction rule (f). 
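
The role of the bound in the termination argument for union can be seen in a small Python transliteration of Figure 4.8. This is a sketch only: `frozenset` stands in for Set-Int', and the helper `bound` is a hypothetical name for a witness of the kind whose existence formula (6) asserts.

```python
def union(s1, s2):
    """Transliteration of the union procedure, with frozenset standing in for Set-Int'."""
    i = 0
    r1 = frozenset(s1)
    r2 = frozenset(s2)
    while len(r1) != 0:
        if i in r1:
            r1 = r1 - {i}      # Remove(r1, i)
            r2 = r2 | {i}      # Insert(r2, i)
        if -i in r1:
            r1 = r1 - {-i}     # Remove(r1, -i)
            r2 = r2 | {-i}     # Insert(r2, -i)
        i = i + 1
    return r2

def bound(s):
    """Return a k such that every element of s lies between -k and k inclusive."""
    return max((abs(j) for j in s), default=0)
```

Once i has passed `bound(s1)`, every element of the first argument has been moved into the result, so r1 is empty and the loop exits. Without such a bound there would be no argument that the loop ever empties r1.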

The following diagram summarizes the relationships among different subtheories 
and the full theory: 

    Th(S)     First Order Predicate Calculus + Infinite Induction Rule
      ∪
    IND(S)    + Infinite Induction Rule
      ∪
    DS(S)     + Proof by Contradiction
      ∪
    EQ(S)     Four Rules of ≡ and the Substitution Rule of ∀

The following theorem shows that the above deductive system is sound. 






Thm. 4.2 For any two ground terms $e_1$ and $e_2$,

(i) if '$e_1 \equiv e_2$' $\in$ Th(S), then $e_1$ and $e_2$ are observably equivalent by S (i.e., observably
equivalent in the models in F(S)), and

(ii) if '$e_1 \not\equiv e_2$' $\in$ Th(S), then $e_1$ and $e_2$ are distinguishable by S.

Proof The theorem follows from the facts that (a) the nonlogical axioms hold in the
models in F(S) with $\equiv$ interpreted as the observable equivalence relation, and (b) the
observable equivalence relations are preserved by the functions in the models in F(S). ■

4.2.6 Properties of a Specification 

We can define properties desirable of a specification by requiring that various 
subtheories and the full theory derived from the specification satisfy certain conditions.
Guttag and Horning [28] have discussed the sufficient completeness property for a
restricted class of specifications, which has been found useful. We state that property in
our framework. We extend it to specifications using auxiliary functions and specifying
preconditions for the operations. The sufficient completeness property captures the
intuitive notion that the behavior of the observers is completely specified on intended
inputs and that the result of an observer on an intended input can be deduced by 
equational reasoning. We relate this property to the behavioral completeness property 
defined in the previous chapter and show that sufficient completeness is stronger than 
behavioral completeness (Theorem 4.4) because behavioral completeness only requires that 
the behavior of the observers be completely specified on intended inputs and it does not 
say anything about what can be deduced from the specification. 

When specifications are used to prove properties of programs using the data types 
being specified, we often need to relate different constructor sequences. In that case, it is 
desirable to have a specification satisfy a stronger property than sufficient completeness, 
which in addition to the requirement that the behavior of the observers can be deduced by 
equational reasoning on any intended input, also requires that the equivalence of the 
observable effect of different constructors can be deduced by equational reasoning. We 
call this property the completeness property of a specification and define it precisely. We
later see that for a complete and consistent specification S, formulas in IND(S) can be
proved using the Knuth-Bendix algorithm (see Subsection 4.2.7).

Recall from Section 3.5 that for a consistent and behaviorally complete
specification S, the models in F(S) are behaviorally equivalent w.r.t. $\{P_\sigma \mid \sigma \in \Omega\}$.
Furthermore, if S does not specify any nontrivial precondition for the operations, the
semantics of a specification S is a single data type, a set of behaviorally equivalent algebras.
In that case, any two ground terms of type D are either observably equivalent by S
or distinguishable by S. An obvious question is whether the proposed deductive system is
powerful enough to deduce this from a consistent and behaviorally complete specification.
We show that it is not the case. But if a specification is consistent and complete, then the
deductive system has this property.

Since S is hierarchical, S should preserve the specifications of the types used in S. 
S should only specify the behavior of the operations of D, and it should not specify the 
behavior of a type D' used in S that is not captured by its specification S'. Specifications so 
designed are modularly structured; they support the factoring and hierarchical structuring 
of the proof of correctness of a hierarchically designed implementation. We define the well
definedness property of a specification, which captures this modularity requirement.

Before we discuss these properties, we prove 

Thm. 4.3 For a consistent S, for any two ground terms $e_1$ and $e_2$ of the same type, both
'$e_1 \equiv e_2$' and '$e_1 \not\equiv e_2$' cannot be in Th(S).

Proof If S is consistent, then $F(S) \neq \emptyset$.

Suppose for some $e_1$ and $e_2$, both '$e_1 \equiv e_2$' and '$e_1 \not\equiv e_2$' are in Th(S). '$e_1 \equiv e_2$' $\in$ Th(S)
implies that $e_1$ and $e_2$ are observably equivalent by S. Similarly, '$e_1 \not\equiv e_2$' $\in$ Th(S) implies
that $e_1$ and $e_2$ are distinguishable by S, which is a contradiction. ■






4.2.6.1 Sufficient Completeness 

As was said earlier for constructors, for a specification specifying nontrivial
preconditions for the operations, one is interested in ground terms in which the input to
every occurrence of an operation symbol satisfies the associated precondition. This is so
because an operation is not likely to be invoked with an input not satisfying the specified
precondition. Even if the operation is invoked on such an input, we are not interested in its
behavior. Furthermore, if a specification uses auxiliary functions, ground terms in which
auxiliary functions appear are also not of interest because they are not used in programs
using the data type. Earlier we defined a legal constructor ground term (Def. 4.3); below,
we extend the definition to a ground term.

Def. 4.4 A ground term e is called legal if and only if (i) e does not have any occurrence of
an auxiliary function, and (ii) for every subterm of e of the form $e_i = \sigma(e_{i1}, \ldots, e_{in})$,
where $\sigma \in \Omega$, '$P_\sigma(e_{i1}, \ldots, e_{in}) \equiv T$' $\in$ EQ(S). ■

For a specification using auxiliary functions and specifying nontrivial preconditions, only
legal ground terms are interesting. If such a specification is consistent and behaviorally
complete, any two legal ground terms are either observably equivalent by S or
distinguishable by S (see Section 3.5).

In [28], Guttag and Horning define the sufficient completeness property of
specifications which do not specify a nontrivial precondition for the operations and do not
use auxiliary functions. We state their definition in our framework.

Def. 4.5 A specification S is sufficiently complete if and only if for every ground term e of
type D' $\in$ A, there exists a theorem derivable from S of the form '$e \equiv e'$', where e' is a
ground term of type D' without any occurrence of an operation symbol of D. ■

In [28], the deductive system to be used to derive a theorem is not specified. Guttag [33]
requires that the equation '$e \equiv e'$' be in the equational subtheory EQ(S).

The sufficient completeness property can be extended to specifications using 
auxiliary functions and specifying nontrivial preconditions for the operations. For auxiliary 
functions, there are two possible extensions: 






(i) Consider only the ground terms expressed using the operation symbols, because only 
these terms can be used in a program, or 

(ii) consider all ground terms, thus requiring that auxiliary functions also be completely
specified.

We take the former approach; however, we recommend that whenever an auxiliary 
function is used, it be completely specified. 

Def. 4.6 A specification is sufficiently complete if and only if for every legal ground term e
of type D' $\in$ A, there is a formula '$e \equiv e'$' $\in$ EQ(S), where e' is a legal ground term of type D'
without any operation symbol of D or any auxiliary function. ■

For example, the specification of Set-Int' is not sufficiently complete, because for instance,
a legal ground term Choose(Insert(Insert(Null, 1), 2)) cannot be related to any ground term
of type Int that does not have any occurrence of an operation symbol of Set-Int'.
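
One way to probe sufficient completeness on individual terms is to use the axioms as left-to-right rewrite rules and see whether an observer term reduces to a value free of the type's operation symbols. The Python sketch below does this for axioms 1 to 5 of Stk-Int (Figure 4.7), since the axioms of Set-Int' are not repeated here. The tuple encoding, the function names, and the operation `Choose` with no axioms (standing in for an underspecified observer) are assumptions made for illustration; rewriting of this kind only approximates membership in EQ(S).

```python
def rewrite(term):
    """Normalize a ground term bottom-up, using axioms 1-5 of Stk-Int left to right."""
    if not isinstance(term, tuple):
        return term                       # Python ints/bools stand for Int and Bool values
    op, args = term[0], tuple(rewrite(a) for a in term[1:])
    if op == "Replace":                   # axiom 3: Replace(s, i) = Push(Pop(s), i)
        return rewrite(("Push", ("Pop", args[0]), args[1]))
    if op in ("Pop", "Top", "Empty"):
        inner = args[0]
        if isinstance(inner, tuple) and inner[0] == "Push":
            if op == "Pop":
                return inner[1]           # axiom 1
            if op == "Top":
                return inner[2]           # axiom 2
            return False                  # axiom 5
        if op == "Empty" and inner == ("Null",):
            return True                   # axiom 4
    return (op,) + args                   # no axiom applies at the root

def is_primitive(term):
    """True if the normal form contains no operation symbol of the specified type."""
    return not isinstance(term, tuple)
```

An observer term such as Top(Replace(Push(Null, 1), 5)) reduces to the integer 5, whereas a term headed by the axiom-less `Choose` stays stuck, which mirrors the situation described for Choose in Set-Int'.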

The following theorem relates sufficient completeness to behavioral
completeness. The intuition behind this result is that if the behavior of observers on
intended inputs can be deduced by equational reasoning from S, then the observers must
be completely specified by S.

Thm. 4.4 If a specification S is sufficiently complete, then S is behaviorally complete.

Proof See Appendix III. ■

The converse of the above theorem however does not hold. So, the sufficient
completeness property is strictly stronger than behavioral completeness, as there are
specifications which are behaviorally complete but are not sufficiently complete. This is so
because in the definition of sufficient completeness, only a fragment of the deductive
system of first order predicate calculus is used to derive properties from the specification.
There can exist a legal ground term e of type D' $\in$ A such that we cannot derive '$e \equiv e'$', for
any e' of type D' not having any occurrence of an operation symbol of D, in the equational
subtheory EQ(S). However, we can derive the above equation in Th(S) using other rules in
addition to the rules of the equational subtheory. We illustrate this point using the
specification of Set-Int'. We add another axiom defining Choose on sets of size > 1 as
returning the maximum integer in the set:

    8. Choose(Insert(Insert(s, i1), i2)) ≡ if Size(s) = 0 then (if ~ i1 = i2 then Max(i1, i2))
           else (if ~ i1 = i2 then Max(Choose(Insert(s, i1)), i2) else Choose(Insert(s, i1))).

The modified specification is not sufficiently complete, because Choose(Insert(Null, i)) is
not directly specified. Nor can we deduce by equational reasoning that
'Choose(Insert(Null, i)) ≡ i.' However, using the theorem of Int '(i = j ≡ T) ⇔ i ≡ j'
derived using the induction rule for integers, the axioms 3, 4, and 7 of Set-Int', and case
analysis, we can prove by contradiction that

    Choose(Insert(Null, i)) ≡ i
It should be obvious that with a minor modification of the proof of Theorem 4.4, we can
prove the following generalization of Theorem 4.4:

Thm. 4.5 If for every legal ground term e of type D' $\in$ A, there exists a ground term e' of
type D' not having any operation symbol of D and auxiliary function such that '$e \equiv e'$' $\in$
Th(S), then S is behaviorally complete. ■

Theorem 4.4 can be derived as a corollary of the above theorem. We conjecture that the
converse of the above theorem is also true, which says that the deductive system is
complete with respect to deducing the behavior of an observer on an intended input.

Conjecture 4.1 If S is behaviorally complete, then for every legal ground term e of type D'
$\in$ A, there exists a ground term e' of type D' not having any operation symbol and auxiliary
function such that '$e \equiv e'$' $\in$ Th(S).

We can prove the following partial completeness result about the deductive
system in proving the distinguishability of legal ground terms of type D', $D' \in A \cup \{D\}$.

Thm. 4.6 For a consistent and sufficiently complete S, if any two legal ground terms $e_1$ and
$e_2$ of type D are distinguishable by S, then '$e_1 \not\equiv e_2$' $\in$ DS(S).

Proof See Appendix III. ■

If Conjecture 4.1 is true, then we can prove a similar result about behaviorally complete
specifications: For a consistent and behaviorally complete specification S, if any two legal
ground terms $e_1$ and $e_2$ of type D are distinguishable by S, then '$e_1 \not\equiv e_2$' $\in$ Th(S).

4.2.6.2 Completeness 

We cannot prove a similar result about the observable equivalence of legal
ground terms of type D, because we do not have a rule analogous to proof by contradiction
in the deductive system that enables us to prove the observable equivalence of ground
terms unless explicitly specified by the nonlogical axioms. Different but equivalent
specifications of the same data type can differ in the extent to which the observable
equivalence relation of legal ground terms of D can be proved from the nonlogical axioms.
For example, the terms Insert(Insert(Null, 2), 2) and Insert(Null, 2) are observably
equivalent by Set-Int', but 'Insert(Insert(Null, 2), 2) ≡ Insert(Null, 2)' $\notin$ Th(Set-Int'). If we
add the following axiom to the specification of Set-Int':

    9. Insert(Insert(s, i1), i2) ≡ if i1 = i2 then Insert(s, i1) else Insert(Insert(s, i2), i1),

then 'Insert(Insert(Null, 2), 2) ≡ Insert(Null, 2)' $\in$ EQ(Set-Int'). The semantics of the
modified specification is the same as the semantics of the original specification of Set-Int'.
The more a specification of D captures the observable equivalence relation on terms of type
D, the more useful it is in deriving the theory of D and hence in proving properties of
programs using D. We define below a property of a specification requiring it to completely
specify the observable equivalence relation. We put a stronger requirement: We want
EQ(S), instead of Th(S), to have a formula '$e_1 \equiv e_2$' for two legal ground terms $e_1$, $e_2$ if and
only if $e_1$ and $e_2$ are observably equivalent by S, so that such formulas can be derived by
purely equational reasoning (i.e., using the rules of $\equiv$ and the substitution rule for $\forall$).
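
The effect of axiom 9 can be illustrated by using it as a rewrite rule on chains of Insert. The Python sketch below rests on an assumption that is not part of the specification: the else branch is applied only when i1 > i2, since read unconditionally as a rewrite rule it would swap the same pair back and forth forever. The tuple encoding and the names `step` and `normalize` are illustrative.

```python
def step(term):
    """Apply axiom 9 once (outermost position first); return None if it is not applicable."""
    if term[0] != "Insert":
        return None
    s, i2 = term[1], term[2]
    if s[0] == "Insert":
        t, i1 = s[1], s[2]
        if i1 == i2:
            return ("Insert", t, i1)                   # then branch: duplicate collapses
        if i1 > i2:
            return ("Insert", ("Insert", t, i2), i1)   # else branch, oriented (assumption)
    inner = step(s)
    return None if inner is None else ("Insert", inner, i2)

def normalize(term):
    """Rewrite until no instance of axiom 9 applies."""
    while True:
        nxt = step(term)
        if nxt is None:
            return term
        term = nxt

two_twice = ("Insert", ("Insert", ("Null",), 2), 2)
two_once = ("Insert", ("Null",), 2)
```

Here `normalize(two_twice)` and `normalize(two_once)` coincide, matching the claim that the two terms become provably equal by equational reasoning once axiom 9 is added.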

Def. 4.7 A sufficiently complete specification S is complete if and only if, assuming that the
specification S' of each $D' \in A \cup A_t$ is complete, for any two legal ground terms $e_1$ and $e_2$
of the same type, '$e_1 \equiv e_2$' $\in$ EQ(S) if and only if $e_1$ and $e_2$ are observably equivalent by S. ■

The completeness property of a specification should not be confused with the completeness
property of a theory of an algebraic structure as defined in Logic [16]. Using Theorems 4.4
and 4.6, and the fact that for a consistent and behaviorally complete specification, any two
legal ground terms are either observably equivalent or distinguishable by S, we have

Thm. 4.7 For a consistent and complete specification S, for any legal ground terms $e_1$ and
$e_2$ of the same type, either '$e_1 \equiv e_2$' $\in$ DS(S) or '$e_1 \not\equiv e_2$' $\in$ DS(S). ■

Musser [61] has called a specification from which either '$e_1 \equiv e_2$' or '$e_1 \not\equiv e_2$' can be
derived in DS(S) fully specified, though his view of a specification is somewhat
different. He views the operator '=' as another operation of a data type, whereas we
consider '=' as a predicate in the underlying logic used to construct formulas.

4.2.6.3 Well Definedness

We would like a specification S to be modular, i.e., for the specification S' of each
$D' \in A \cup A_t$, $Th(S)|_{L(S')} = Th(S')$. This means that Th(S) does not have a formula
expressed using symbols in L(S') that is not in Th(S'). Only those properties which involve
an operation symbol of D and/or auxiliary functions used in S can be proved from S; a
formula not having any operation symbol of D or an auxiliary function in S and not in
Th(S') cannot be proved from S.

For a consistent and sufficiently complete specification, the following holds: 

Thm. 4.8 For a consistent and sufficiently complete S, for any legal ground terms $e'_1$, $e'_2$ of
type D' $\in$ A constructed using the symbols in L(S'), if neither '$e'_1 \equiv e'_2$' $\in$ Th(S') nor
'$e'_1 \not\equiv e'_2$' $\in$ Th(S'), where S' is a specification of D', then '$e'_1 \not\equiv e'_2$' $\notin$ Th(S).

Proof By contradiction.

Suppose '$e'_1 \not\equiv e'_2$' $\in$ Th(S), meaning that $e'_1$ and $e'_2$ are distinguishable by S (as well as by
S') (by Theorem 4.2). By Theorem 4.6, '$e'_1 \not\equiv e'_2$' $\in$ Th(S'), which is not the case. So the
theorem. ■

However, we could have a specification S such that '$e'_1 \equiv e'_2$' $\in$ Th(S) in the above case. The
following property of a specification rules out such cases.






Def. 4.8 A specification S is well defined if and only if for every $D' \in A \cup A_t$, assuming that
S' of D' is well defined, $Th(S)|_{L(S')} = Th(S')$. ■

We are usually interested in well defined and complete specifications. 
Behaviorally incomplete specifications are occasionally of interest. Set-Int' is such an 
example. 

4.2.7 Automation of IND(S) 

Recently Musser [61] has discussed how to automate IND(S) when S satisfies
certain conditions. If (i) S is consistent and complete, and (ii) the nonlogical axioms
derived from S can be written as equations (possibly using the if-then-else operator), then the
Knuth-Bendix algorithm, which treats equational axioms as rewrite rules, can be used to
derive an equational formula '$e_1 \equiv e_2$' in the inductive subtheory IND(S). The equation
'$e_1 \equiv e_2$' is input to the algorithm as a rewrite rule to get a new convergent set of rules
having the added rewrite rule. There are three possibilities:

(i) The algorithm succeeds implying that the new equation is consistent with the 
nonlogical axioms and thus provable, 

(ii) an inconsistency, such as '$e'_1 \to e'_2$' where $e'_1$ and $e'_2$ can be proved to be not equal, in
particular 'T → F' or 'F → T,' is generated as a rule, implying that the equation is not a
theorem, and

(iii) the algorithm does not terminate implying that (a) an additional lemma be proved 
first, which could be guessed from the set of new rules generated, (b) the specified ordering 
on terms used by the algorithm does not work, and some other ordering needs to be tried, 
or (c) there does not exist a finite convergent set of rules to express IND(S). 

The basis of deducing from (ii) that '$e_1 \equiv e_2$' is not a theorem is the consistency of S and the
method of proof by contradiction; in fact '$e_1 \not\equiv e_2$' is a theorem in IND(S) in this case. The
basis of deducing from (i) that '$e_1 \equiv e_2$' is a theorem in IND(S) is the completeness of the
specifications: For a substitution of all variables in $e_1$ and $e_2$ by ground terms, the resulting
ground terms $e'_1$ and $e'_2$ have the property that either '$e'_1 \equiv e'_2$' $\in$ IND(S) or
'$e'_1 \not\equiv e'_2$' $\in$ IND(S).






4.3 Theory of Exceptions Without Nondeterminism 

We now incorporate the exceptional behavior of data types into their theories
with the assumption that specifications do not specify nondeterministic operations. New
atomic formulas are introduced to express the exceptional behavior of the operations. We
describe how the nonlogical axioms of Th(S) can be derived in this case from a
specification S. We discuss how to construct EQ(S), DS(S), IND(S), and Th(S). New
'logical' axioms characterizing the exceptional behavior of the operations are presented.
We extend the properties of a specification discussed in the previous section to
specifications specifying the exceptional behavior. For illustration, we modify the
specification of Set-Int' so that the operation Choose is required to signal no-element() on
the empty set; let Set-Int'' stand for the modified Set-Int'. Instead of specifying a
precondition for Choose, its Restrictions component specifies a required exception
condition as follows:

    #(s) = 0 ⇒ Choose(s) signals no-element().

We also use the specification of Stk-Int.

Besides the operation symbols and auxiliary function symbols, the language L(S) 
also includes the names of exceptions signalled by the operations as specified in S. 
Exception terms are constructed as discussed in Chapter 2, using terms and exception 
names. There are two new sets of atomic formulas in addition to equations: 

(a) e signals ext,
where e is a term, ext is an exception term, and every variable in ext is also in e; and

(b) $ext_1 \equiv ext_2$,
where $ext_1$ and $ext_2$ are exception terms. The predicate 'signals' is similar to $\equiv$ but its arity
is $(D \cup EXV) \times EXV$.

As in the previous section, we first discuss the derivation of the nonlogical axioms 
of Th(S) from S. Then, we discuss the subtheories EQ(S), DS(S), and IND(S), and the full 
theory Th(S). In the last subsection, we extend sufficient completeness, completeness, and 
well definedness properties. 






4.3.1 Derivation of Nonlogical Axioms

The nonlogical axioms of Th(S) are derived from the restrictions and axioms 
components of the specification S in a slightly different way than discussed in 
Subsection 4.2.1. We first discuss the restrictions, and later the formulas in the axioms
component.

4.3.1.1 Restrictions Component 

From a restriction specifying a required exception signalled by an operation $\sigma$,

    $R_i(X) \Rightarrow \sigma(X)$ signals $ext$,

we get the following nonlogical axiom:

    $P_\sigma(X) \Rightarrow (R_i(X) \Rightarrow \sigma(X)$ signals $ext)$,

because the restriction holds only if the input X satisfies the precondition associated with
$\sigma$. 5 For example, the restriction on the operation Top in the specification of Stk-Int,

    Empty(s) ⇒ Top(s) signals no-top(),

is a nonlogical axiom of Th(Stk-Int), as the precondition for Top is T. Similarly, from a
restriction specifying an optional exception signalled by an operation $\sigma$,

    $\sigma(X)$ signals $ext \Rightarrow O_j(X)$,

we get

    $P_\sigma(X) \Rightarrow (\sigma(X)$ signals $ext \Rightarrow O_j(X))$,

as a nonlogical axiom. For example, the restriction on Push,

    Push(s, i) signals overflow(s, i) ⇒ #(s) ≥ 100,

is a nonlogical axiom of Th(Stk-Int).
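
The two kinds of restriction can be checked against a small executable model. The Python sketch below is one assumed model of Stk-Int (stacks as tuples, exception values as a hypothetical `Signal` class, the bound 100 read from Figure 4.7). It chooses to signal overflow whenever the optional restriction permits it; a model that never signalled overflow would satisfy the same axiom.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    """An exception value: an exception name applied to argument values."""
    name: str
    args: tuple = ()

LIMIT = 100  # bound appearing in the restriction on Push

def top(s):
    if len(s) == 0:
        return Signal("no-top")            # required exception
    return s[-1]

def push(s, i):
    if len(s) >= LIMIT:
        return Signal("overflow", (s, i))  # optional exception, taken in this model
    return s + (i,)

def required_axiom_top(s):
    """P(s) => (Empty(s) => Top(s) signals no-top()), where P(s) is T."""
    return len(s) != 0 or top(s) == Signal("no-top")

def optional_axiom_push(s, i):
    """P(s, i) => (Push(s, i) signals overflow(s, i) => #(s) >= 100), where P is T."""
    return push(s, i) != Signal("overflow", (s, i)) or len(s) >= LIMIT
```

The required axiom forces the signal when its condition holds, while the optional axiom only constrains when a signal may occur, which is why the implication points in the opposite direction.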



5. Recall that the boolean term $R_i(X)$ is an abbreviation for the formula $R_i(X) \equiv T$.






4.3.1.2 Axioms Component 

The preconditions in the restrictions component are also used in constructing the
nonlogical axioms from the formulas in the axioms component of S. As discussed in
Chapter 3, a variable in a formula in the axioms component cannot be freely substituted.
When the exceptional behavior was not considered in Subsection 4.2.1, the substitution was
conditional: The arguments to every operation in the axiom must satisfy the associated
precondition. Now, there is an additional requirement: The substitution should not result
in an operation signalling on its arguments.

To express the second condition, we introduce a unary auxiliary function
$N?_{D'} : D' \cup EXV \to Bool$ for every $D' \in A \cup \{D\} \cup A_t$. These auxiliary functions are not
used in a specification. Informally, $N?_{D'}$ separates a normal value of D' from an exception: It
returns T if its argument interprets to a normal value of D'; it returns F if its argument
signals an exception. Furthermore, $N?_{D'}(\sigma(e_1, \ldots, e_n))$ is F if $N?_{D_i}(e_i)$ is false for any $e_i$;
this constraint on the behavior of $N?_{D'}$ enables us to get a simpler transformation of the
restricted formulas in the axioms component of S.

Using $N?_{D'}$, we transform a restricted formula in the axioms component to an
unrestricted formula which serves as a nonlogical axiom of Th(S). If an equation '$e_1 \equiv e_2$' is
in the axioms component, where $e_1$ and $e_2$ are of type D, then the corresponding
unrestricted axiom is

    $(N?_D(e_1) \wedge N?_D(e_2)) \Rightarrow ((PC_{e_1} \wedge PC_{e_2}) \Rightarrow e_1 \equiv e_2)$,

where $PC_e$ is a conjunction of conditions expressing the constraint that the input to every
operation invocation in a term e satisfies the associated precondition. Similarly, if a
restricted formula is '$e_1 \equiv$ if b then $e_2$,' then the corresponding unrestricted formula is

    $(N?_{Bool}(b) \wedge N?_D(e_1) \wedge N?_D(e_2)) \Rightarrow ((PC_b \wedge PC_{e_1} \wedge PC_{e_2}) \Rightarrow (b \Rightarrow e_1 \equiv e_2))$.

If a restricted formula is '$e_1 \equiv$ if b then $e_2$ else $e_3$,' then the corresponding unrestricted
formulas are obtained using the fact that this formula is equivalent to two conditional
equations

    $e_1 \equiv$ if b then $e_2$
    $e_1 \equiv$ if ~b then $e_3$.






We illustrate the above transformation on the following equation in the axioms component
of the specification of Stk-Int:

    Replace(s, i) ≡ Push(Pop(s), i).

The corresponding unrestricted axiom is

    $(N?_{Stk\text{-}Int}(\mathrm{Replace}(s, i)) \wedge N?_{Stk\text{-}Int}(\mathrm{Push}(\mathrm{Pop}(s), i))) \Rightarrow (\sim\mathrm{Empty}(s) \Rightarrow \mathrm{Replace}(s, i) \equiv \mathrm{Push}(\mathrm{Pop}(s), i))$.
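
The guards in the unrestricted axiom matter because nothing is said about Replace outside its precondition. The Python sketch below checks the axiom in an assumed model of Stk-Int in which Replace on the empty stack returns a deliberately arbitrary value; the `Signal` class, the function `is_normal` (playing the role of N?), and the tuple representation of stacks are illustrative choices, not part of the specification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    name: str
    args: tuple = ()

def is_normal(v):
    """N?: T on a normal value, F on an exception value."""
    return not isinstance(v, Signal)

def empty(s):
    return len(s) == 0

def push(s, i):
    return Signal("overflow", (s, i)) if len(s) >= 100 else s + (i,)

def pop(s):
    return s[:-1]   # on the empty stack this model arbitrarily returns the empty stack

def replace(s, i):
    # Outside the precondition ~Empty(s) the result is deliberately arbitrary.
    return s[:-1] + (i,) if s else (i, i)

def unrestricted_axiom_3(s, i):
    """(N?(lhs) and N?(rhs)) => (~Empty(s) => lhs == rhs) for axiom 3 of Stk-Int."""
    lhs, rhs = replace(s, i), push(pop(s), i)
    normal = is_normal(lhs) and is_normal(rhs)
    return (not normal) or empty(s) or lhs == rhs
```

On the empty stack the two sides differ in this model, yet the unrestricted axiom still holds because the precondition guard fails; the unguarded equation would be violated.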

4.3.1.3 Definition of $N?_{D'}$

A specification of D implicitly defines $N?_D$ and extends $N?_{D'}$ for every defining
type D' of D as well as any auxiliary types D' used in S. $N?_{D'}$ is defined by the specification
of D'. Since an operation $\sigma$ has the arity $D_1 \times \cdots \times D_n \to D' \cup EXV$, and $N?_{D'}$ has the arity
$D' \cup EXV \to Bool$, we need to introduce variables ranging over values of a type and
exceptions to characterize $N?_{D'}$. We have two options: (i) Introduce two kinds of variables,
variables of a single type $D_i$ and variables of a union type $D_i \cup EXV$, or (ii) introduce only
variables of a union type. If we adopt the second alternative, the formulas expressing the
normal behavior of the operations get long because we make the conditional use of the
variables. Since we would mostly be using formulas expressing normal behavior, we have
adopted the first alternative. Often, we do not need to have a formula in which both kinds
of variables are mixed. Except in the axioms for $N?_{D'}$ and the axioms characterizing the
general properties of the exceptional behavior of the data type, we would rarely use
variables of a union type. Terms as well as exception terms are constructed using only
variables ranging over a single type (except in the next section). Henceforth, we use
$xe, xe_1, \ldots, xe_n, \ldots$, $ye, ye_1, \ldots, ye_n, \ldots$, $ze, ze_1, \ldots, ze_n$, etc., as variables of a union type,
and $exv, exv_1, \ldots, exv_n, \ldots$ as variables of type EXV.

We now discuss the axioms defining $N?_{D'}$. First of all, for a variable x of type D',
we have the axiom

    $N?_{D'}(x) \equiv T$.

For an operation $\sigma$, let $P_\sigma(X)$ be its precondition. Let us assume that the restrictions
component specifies for $\sigma$, l required exceptions and m optional exceptions. For each
$1 \le i \le l$, let $R_i(X)$ be the condition on input X when $\sigma$ is required to signal an exception;
similarly, for each $1 \le j \le m$, let $O_j(X)$ be the condition when $\sigma$ has an option to signal.

For every constructor $\sigma$ of D, we have an axiom defining $N?_D$ corresponding to $\sigma$,

    $N?(XE) \Rightarrow ((P_\sigma(XE) \wedge (\sim R_1(XE) \wedge \cdots \wedge \sim R_l(XE)) \wedge (\sim O_1(XE) \wedge \cdots \wedge \sim O_m(XE))) \Rightarrow N?_D(\sigma(XE)))$,

where XE stands for the variables $xe_1, \ldots, xe_n$; $xe_i$ is a variable of union type $D_i \cup EXV$, and
N?(XE) is an abbreviation for $N?_{D_1}(xe_1) \wedge \cdots \wedge N?_{D_n}(xe_n)$.

The above axiom captures the assumption in a specification that if (i) an input to a
constructor $\sigma$ is normal, (ii) the input satisfies the precondition associated with $\sigma$, (iii) none
of the conditions associated with a required exception for $\sigma$ holds for the input, and (iv) the
condition an input must satisfy in case $\sigma$ signals an exception specified to be optional also
does not hold for the input, then $\sigma$ returns a normal value. In other words, this assumption
states that the exceptional behavior of the operations on their intended inputs must be
completely specified by the Restrictions component.
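
The assumption stated by this axiom can be tested on a concrete operation by enumerating sample inputs. The Python sketch below is a hypothetical checker, not the author's construction: `normality_axiom_holds` takes an operation, its precondition, and its lists of required and optional exception conditions, and reports whether every intended input yields a normal value. The operation `eager_top` is an invented faulty variant of Top used only to show a violation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    name: str
    args: tuple = ()

def is_normal(v):
    return not isinstance(v, Signal)

def normality_axiom_holds(op, pre, required, optional, inputs):
    """Check (P(X) and no R_i(X) and no O_j(X)) => N?(op(X)) on the given normal inputs."""
    for x in inputs:
        intended = (pre(*x)
                    and not any(r(*x) for r in required)
                    and not any(o(*x) for o in optional))
        if intended and not is_normal(op(*x)):
            return False
    return True

def top(s):
    return Signal("no-top") if len(s) == 0 else s[-1]

def eager_top(s):
    # Hypothetical faulty variant: also signals on one-element stacks.
    return Signal("no-top") if len(s) <= 1 else s[-1]

STACKS = [((),), ((1,),), ((1, 2),)]   # argument tuples, one stack each

def always(s):
    return True

def is_empty(s):
    return len(s) == 0
```

With the required condition Empty(s), `top` passes the check on `STACKS`, while `eager_top` fails it because it signals on an input for which no exception condition was specified.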

The extension of the definition of $N?_{D'}$ for every D' $\in$ A is also captured by a
similar set of axioms corresponding to every observer $\sigma \in \Omega$ of result type D'. There is an
axiom having the above structure corresponding to every observer $\sigma$ in $\Omega$.

In addition to the above axioms, we have a rule for every operation and auxiliary
function expressing that if any argument to a function is not normal, then the result of the
function invocation is also not normal.

    $(N?_{D_1}(xe_1) \equiv F \vee \cdots \vee N?_{D_n}(xe_n) \equiv F) \;\vdash\; N?_{D'}(\sigma(xe_1, \ldots, xe_n)) \equiv F$.

Note that there is no axiom so far which states the condition when $N?_{D'}$ is F. In the next
subsection on equational subtheory, we introduce a rule characterizing such behavior of
$N?_{D'}$.

We use the nonlogical axioms derived from the restrictions and axioms
components of S, and the axioms defining $N?_{D'}$ along with the additional axioms and rules
characterizing the general properties about the exceptional behavior to build various
subsets of Th(S) and finally Th(S) itself.






4.3.2 Equational Subtheory 

As in the case of specifications without nondeterminism and without exceptional
behavior, we define the equational subtheory EQ(S) as a set of atomic formulas. Besides
equations of the kind discussed in Subsection 4.2.2, we also have the following atomic
formulas:

(a) e signals ext, and

(b) $ext_1 \equiv ext_2$.

In addition to the rules characterizing $\equiv$ discussed in Subsection 4.2.2, we use the
substitution rule for $\forall$, and the rules characterizing 'signals' and capturing the observable
equivalence relation on exception values. The substitution rule for $\forall$,

    $(\forall x)\,\Phi(x) \Rightarrow \Phi[x/e]$,

where x is a variable of type D', and e is a term of type D' and is substitutible for x in $\Phi$ [16],
is modified to

    $(\forall x)\,\Phi(x) \Rightarrow (N?_{D'}(e) \equiv T \Rightarrow \Phi[x/e])$,

since x is a variable ranging over normal values and e can signal an exception.

Rule (i) below says when N?_D' is false: if a term of type D' signals an
exception, then N?_D' on that term is false. Rule (ii) says that if two terms are observably
equivalent and one signals an exception, then the other also signals the same exception.
Rule (iii) states that if a term signals two exception values, then they are observably
equivalent. Rule (iv) states how the observable equivalence relation on exception values is
related to the observable equivalence relations on normal values.

(i) xe signals exv ⊢ N?_D'(xe) ≡ F,

(ii) xe_1 ≡ xe_2, xe_1 signals exv ⊢ xe_2 signals exv,

(iii) xe signals exv_1, xe signals exv_2 ⊢ exv_1 ≡ exv_2, and
for every exception name ex of arity D_1 × ... × D_n,

(iv) x_11 ≡ x_21, ..., x_1n ≡ x_2n ⊢ ex(x_11, ..., x_1n) ≡ ex(x_21, ..., x_2n).
It should be obvious that the above rules are sound under the following interpretation: In a
type algebra A, for a ground term e and a ground exception term ext, the formula
'e signals ext' is interpreted as: The interpretation of e in A is the exception value that is the
interpretation of ext in A. For two ground exception terms ext_1 and ext_2, the formula
'ext_1 ≡ ext_2' is interpreted as: The interpretation of ext_1 is observably equivalent to the
interpretation of ext_2 in EXV of A.

We now show how to use the above rules along with the nonlogical axioms and
the axioms and rules defining N?_D' to prove some properties of data types. Since many
nonlogical axioms and formulas are conditional, having the form

(7) b ⇒ e signals ext,
where b is a boolean term, we use a trick similar to the one used in Subsection 4.2.2 to deal
with such formulas so that they can be proved in EQ(S). We introduce an auxiliary
function if-then : Bool × EXV × D' → D' ∪ EXV having the behavior defined by the
following axioms:

if-then(T, ext, e) signals ext

if-then(F, ext, e) ≡ e.
Using the auxiliary function if-then, the formula (7) is equivalent to

e ≡ if-then(b, ext, e),
as for an instantiation of the variables in (7), if b interprets to T, then (7) is equivalent to
'e signals ext'. The boolean term b must not signal.
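The if-then encoding can be mimicked in a small executable model. The following is only a sketch under stated assumptions: the names Exc, is_normal, signals and if_then, and the representation of a stack as a Python tuple, are introduced here for illustration and are not part of the specification language. The third argument of if_then is passed as a thunk so that the normal result is computed only when the condition is false.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exc:
    """An exception value: an exception name plus a tuple of arguments."""
    name: str
    args: tuple = ()


def is_normal(v):
    """Model of N?: True for normal values, False for exception values."""
    return not isinstance(v, Exc)


def signals(v, exc):
    """Model of 'e signals ext': the value of e is exactly that exception value."""
    return isinstance(v, Exc) and v == exc


def if_then(b, exc, e):
    """if-then(T, ext, e) signals ext; if-then(F, ext, e) is e (e is a thunk)."""
    return exc if b else e()


NULL = ()  # assumed representation: a stack is a tuple of integers


def empty(s):
    return s == NULL


def top(s):
    # A non-normal argument gives a non-normal result; returning the argument
    # itself is a choice made for this sketch only.
    if not is_normal(s):
        return s
    # Restriction on Top written as a single equation using if-then.
    return if_then(empty(s), Exc("no-top"), lambda: s[-1])
```

With this model, top(NULL) evaluates to the exception value no-top(), mirroring 'Top(Null) signals no-top()', while top((1, 2)) is the normal value 2.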

As an illustration, we prove from the nonlogical axioms of Stk-Int that
'Top(Null) signals no-top()' ∈ EQ(Stk-Int) in Figure 4.10. Similarly, we can prove

Top(Pop(Push(Null, i))) signals no-top(),

Replace(Push(Push(Null, i1), i2), i3) ≡ Push(Push(Null, i1), i3).
However,

Replace(Push^101((Null, 1), ..., 101), 0) ≡ Push^101(((Null, 1), ..., 100), 0)
is not derivable because we cannot derive 'N?_Stk-Int(Push^101((Null, 1), ..., 101)) ≡ T' due to the optional
exception specified for Push when its stack argument is of size ≥ 100. But we can prove the
following formula:

N?_Stk-Int(Push^101((Null, 1), ..., 101)) ≡ T ⇒
Replace(Push^101((Null, 1), ..., 101), 0) ≡ Push^101(((Null, 1), ..., 100), 0).

The formula

Pop(Null) ≡ Null

is not derivable because of the precondition on Pop.

Figure 4.10. Proof of 'Top(Null) signals no-top()'

1. Top(s) ≡ if-then(Empty(s), no-top(), Top(s))             Restriction on Top

2. Empty(Null) ≡ T                                           Axiom 4

3. if-then(Empty(Null), no-top(), Top(Null)) signals no-top()   Axiom of if-then

4. Top(Null) signals no-top()                                Substitution in 1, and rule (ii) above

It would be interesting to investigate the conditions under which

(i) an axiom of the form 'e signals ext' can be treated as a rewrite rule 'e → ext' and the
Knuth-Bendix algorithm be applicable to such axioms, and

(ii) a conditional formula involving signals can be rewritten as an equation using the
if-then and if-then-else operators so that the Knuth-Bendix algorithm is applicable to
conditional formulas also.

4.3.3 Distinguishability Subtheory 

As in case of specifications without nondeterminism and without exceptional
behavior, DS(S) is defined to be a set consisting of atomic formulas and the negations of
atomic formulas. DS(S) includes EQ(S) as well as formulas having the following structure:

(b) ext_1 ≢ ext_2, and
(c) e ~signals ext,
where 'e ~signals ext' is an abbreviation for '~ (∀ x_1, ..., x_n)[e signals ext]' such that
x_1, ..., x_n are all the variables in the formula 'e signals ext'. Besides the axioms and rules
of inference of DS(S) discussed in Subsection 4.2.3, we have the following additional
axioms and rules expressing properties about the exceptional behavior of data types which
enable us to prove formulas having the above structure,
(v) for every exception name ex : D_1 × ... × D_n,

(~ x_11 ≡ x_21 ∨ ... ∨ ~ x_1n ≡ x_2n) ⊢ ~ ex(x_11, ..., x_1n) ≡ ex(x_21, ..., x_2n).
(vi) for different exception names ex_1 : D_1 × ... × D_n and ex_2 : D'_1 × ... × D'_m in L(S),

~ ex_1(x_11, ..., x_1n) ≡ ex_2(x_21, ..., x_2m).

(vii) for a union type D' ∪ EXV,

N?_D'(xe_1) ≡ T, N?_D'(xe_2) ≡ F ⊢ ~ xe_1 ≡ xe_2,
where xe_1 and xe_2 are of type D' ∪ EXV, and

(viii) N?_D'(xe) ≡ T ⊢ ~ (∀ exv)[xe signals exv].
Rule (v) and axiom (vi) capture the distinguishability relation on exception values. Rule
(v) is the opposite of rule (iv) given in the previous subsection; it states that two exception
values having the same name are distinguishable if any of the arguments in one value is
distinguishable from the corresponding argument in the other value. Axiom (vi) states that
two exception values are distinguishable if their exception names are different. Rule (vii)
states that two values are distinguishable if N?_D' holds for one and does not hold for the
other. Rule (viii) says that if N?_D' holds for a term, then it cannot signal an exception. The
above axiom and rules are clearly sound. Note that these rules can be used to derive
formulas having the structure '~ xe_1 ≡ xe_2', which implies that 'xe_1 ≢ xe_2'.
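Rules (v) and (vii) and axiom (vi) can be read as a recursive decision procedure on values. The following is a minimal sketch, assuming exception values are represented as triples ("exc", name, args) and that normal values are compared by plain equality (adequate for integers, but only an approximation of observable distinguishability in general). The function names are hypothetical.

```python
def is_exc(v):
    """True when v is an exception value in the assumed ('exc', name, args) encoding."""
    return isinstance(v, tuple) and len(v) == 3 and v[0] == "exc"


def distinguishable(v1, v2):
    if is_exc(v1) != is_exc(v2):
        return True                  # rule (vii): N? holds for exactly one of them
    if not is_exc(v1):
        return v1 != v2              # normal values: equality stands in for equivalence
    _, name1, args1 = v1
    _, name2, args2 = v2
    if name1 != name2:
        return True                  # axiom (vi): different exception names
    # rule (v): same name, some pair of corresponding arguments is distinguishable
    return any(distinguishable(a, b) for a, b in zip(args1, args2))
```

For instance, the value of Top(Null), encoded as ("exc", "no-top", ()), is distinguishable from every integer by rule (vii), which is the content of theorem (8) in this subsection.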

We can derive from the nonlogical axioms of Stk-Int using rule (vii) that
(8) Top(Null) ≢ i

because 'Top(Null) signals no-top()', 'N?_Int(i) ≡ T', and 'N?_Int(Top(Null)) ≡ F' ∈
DS(Stk-Int). The formula
overflow(s, i) ≢ no-top()

is immediate from the axiom (vi) above. Using the theorem (8) in DS(Stk-Int), we can
prove by contradiction that
Null ≢ Push(s, i).

4.3.4 Inductive Subtheory

The inductive subtheory IND(S) can be constructed as in Subsection 4.2.4; we
can also use the above axioms and rules characterizing the exceptional behavior. The
induction rule (I) in Subsection 4.2.4 should be modified: instead of requiring that for every
constructor ground term e of type D, Φ[x/e] be derivable in the premise, we only need to
consider constructor ground terms for which 'N?_D(e) ≡ T' is derivable. So, we have:
Modified Induction Rule

Given a formula Φ(x) with a free variable x of type D,
For every constructor ground term e of type D, N?_D(e) ≡ T ⇒ Φ[x/e] ⊢ (∀ x) Φ(x).






We can use the methods discussed in Subsubsection 4.2.4.3 to establish the infinitely many 
premises. 

As in Subsubsection 4.2.4.4, if a specification S specifies nontrivial preconditions
on constructors, then the above formula can be simplified to

for every legal constructor ground term e of type D, N?_D(e) ≡ T ⇒ Φ[x/e]
⊢ (∀ x) Φ(x),
because of the assumption about the semantics of a constructor on inputs not satisfying the
associated precondition, discussed in Chapter 3.

For example, for Stk-Int, the induction rule is:
For every legal constructor ground term e of type Stk-Int,
N?(e) ≡ T ⇒ Φ[s/e] ⊢ (∀ s) Φ(s).
The above rule can be simplified using the following theorem in a way similar to Set-Int' in
the previous section:

Thm. 4.9 Every legal constructor ground term e of type Stk-Int such that
'N?_Stk-Int(e) ≡ T' ∈ EQ(Stk-Int) is equivalent by equational reasoning to another legal
constructor ground term e' having only Null and Push, i.e., if 'N?_Stk-Int(e) ≡ T' ∈
EQ(Stk-Int), then 'e ≡ e'' ∈ EQ(Stk-Int).

Proof By induction on the number of Pop and Replace in a constructor ground term e
using axioms 1 and 3 in Figure 4.7. See the details in Appendix III. ■

The simplified induction rule is:

(9) For every legal constructor ground term e of type Stk-Int having occurrences of
Null and Push only, N?_Stk-Int(e) ≡ T ⇒ Φ[s/e] ⊢ (∀ s) Φ(s).
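The reduction behind Thm. 4.9 can be pictured as a rewriting of constructor terms. The sketch below is an illustration, not the proof in Appendix III. It assumes the two axioms have the usual form Pop(Push(s, i)) ≡ s and Replace(Push(s, i), j) ≡ Push(s, j) (Figure 4.7 is not reproduced here), represents terms as nested tuples, and ignores the optional overflow exception of Push. The function name normalize is hypothetical.

```python
def normalize(term):
    """Rewrite a constructor ground term to one built from Null and Push only.

    Returns None when an operation is applied where the assumed axioms do not
    apply (for example Pop(Null)), i.e. where the term is not known to be normal.
    """
    op = term[0]
    if op == "Null":
        return term
    if op == "Push":
        s = normalize(term[1])
        return None if s is None else ("Push", s, term[2])
    if op == "Pop":
        s = normalize(term[1])
        if s is None or s[0] != "Push":
            return None
        return s[1]                       # assumed axiom: Pop(Push(s, i)) = s
    if op == "Replace":
        s = normalize(term[1])
        if s is None or s[0] != "Push":
            return None
        return ("Push", s[1], term[2])    # assumed axiom: Replace(Push(s, i), j) = Push(s, j)
    raise ValueError(f"unknown constructor {op!r}")
```

Each recursive call removes one occurrence of Pop or Replace, which matches the induction measure used in the proof.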

4.3.5 The Full Theory

The full theory Th(S) is also constructed in a similar way as for data types without
exceptional behavior. For example, we can prove the normal form theorem using the
simplified induction rule (9):

(s ≡ Null) ∨ (∃ s', i)[s ≡ Push(s', i)].






The diagram summarizing the relationships among different subtheories for 
specifications not specifying exceptional behavior on p. 135 also holds in this case. 

For the extended deductive system, the following extension of Theorem 4.2
holds:

Thm. 4.10 (i) For any two ground terms e_1 and e_2 of the same type, if 'e_1 ≡ e_2' ∈ Th(S),
then e_1 and e_2 are observably equivalent by S, and if 'e_1 ≢ e_2' ∈ Th(S), then e_1 and e_2 are
distinguishable by S,

(ii) for a ground term e and a ground exception term ext, if 'e signals ext' ∈ Th(S), then
the interpretation of e in every model A in F(S) is the interpretation of ext in A,

(iii) for two ground exception terms ext_1 and ext_2, if 'ext_1 ≡ ext_2' ∈ Th(S), then ext_1 and
ext_2 are observably equivalent by S, and if 'ext_1 ≢ ext_2' ∈ Th(S), then ext_1 and ext_2 are
distinguishable by S, and

(iv) for any ground term e, if 'N?(e) ≡ T' ∈ Th(S), then the interpretation of e in every
model A in F(S) is a normal value, and if 'N?(e) ≡ F' ∈ Th(S), then the interpretation of e
in A is either an exception value or undefined.

Proof The theorem follows from the facts that

(a) the nonlogical axioms of Th(S) hold in every model in F(S),

(b) the observable equivalence relation used as the interpretation of ≡ is a congruence,

(c) the exceptional behavior of an operation is completely specified by the restrictions
component of S on inputs satisfying its preconditions, and

(d) the axioms and rules defining N? and characterizing the exceptional behavior hold
in every type algebra. ■

We demonstrate how the full theory constructed from a specification S can be
used to prove properties of programs using the data types specified by S. Figure 4.11 is
another implementation of the union procedure using Choose in a CLU-like language. In this
implementation, an element of the first set argument to union is successively selected using
the operation Choose, removed from the copy of the first argument, and inserted into the
copy of the second argument until the operation Choose signals no-element, indicating that
the set is empty. The handler for no-element associated with the loop is then invoked. In
the code, we have included formulas within '{ }' that express relations among different
variables at that point in the code. The Floyd-Hoare inductive assertion method for
proving properties of programs [17, 36, 55] can be extended to incorporate the exceptional
behavior of programs. A statement in this case can terminate in more than one way: either
normally or by signalling an exception. Corresponding to every possible way of
termination of a statement, we associate an input formula for an output formula.

Figure 4.11. Procedure Union - II

union = proc(s1, s2 : Set-Int") returns (Set-Int")
    i : Int
    r1 : Set-Int" := s1
    r2 : Set-Int" := s2
    { r1 ≡ s1 ∧ r2 ≡ s2 }
    while true do
        { (Size(r1) = 0 ≡ F ∧ IN(Remove(r1, Choose(r1)), Insert(r2, Choose(r1)), s1, s2))
          ∨ (Size(r1) = 0 ≡ T ∧ R_union(s1, s2)) }
        i := Set-Int"$Choose(r1)
        { IN(Remove(r1, i), Insert(r2, i), s1, s2) }
        r1 := Set-Int"$Remove(r1, i)
        r2 := Set-Int"$Insert(r2, i)
        { IN(r1, r2, s1, s2) }
    end except when no-element :
        end
    { R_union(s1, s2) }
    return (r2)
    { R }
end union

IN(r1, r2, s1, s2) = (∀ j) [ ((Has(s1, j) ∨ Has(s2, j)) ⇔ (Has(r1, j) ∨ Has(r2, j))) ≡ T ] ∧
    (Size(r1) + Size(r2)) ≤ (Size(s1) + Size(s2)) ≡ T ∧ Size(r2) ≥ 0 ≡ T

I/O Specification for union

T ⇒ R, where R = R1 ∧ R2, and

R1 = (∀ i) [ ((Has(s1, i) ∨ Has(s2, i)) ⇔ Has(union(s1, s2), i)) ≡ T ]
R2 = Size(union(s1, s2)) ≤ Size(s1) + Size(s2) ≡ T

Figure 4.11 includes the input-output specification of union. We use the
following notation for specifying a procedure F(X): Corresponding to every possible
outcome of F on an input X, there is a formula relating the input to the outcome. Since F
can terminate normally or by signalling an exception, we specify the weakest input
condition for normal termination, as well as for every exception signalled by F.

TC_1(X) ⇒ F(X) signals ext_1
    ...
TC_m(X) ⇒ F(X) signals ext_m
TC_{m+1}(X) ⇒ R(X, r),

where TC_1(X), ..., TC_{m+1}(X), and R are first order formulas, and r stands for a possible
result returned by F on the input X. 'TC_i(X) ⇒ F(X) signals ext_i' is interpreted as: The
weakest input condition for F to terminate by signalling ext_i is TC_i(X).
'TC_{m+1}(X) ⇒ R(X, r)' is similarly interpreted as: The weakest input condition for F to
terminate normally returning a value r such that R(X, r) holds is TC_{m+1}(X). If F is
deterministic, then such an r is unique for every X; otherwise, there can be many r's such
that R(X, r) holds. Instead of using r as denoting a result returned by F on X, we can also
use F(X).

The formula 'IN(r1, r2, s1, s2)' is used as an invariant of the loop in the program
in Figure 4.11. Using the backward substitution semantics of the control structures, we can
generate the verification conditions and show the required formulas to be in Th(Set-Int").
The partial correctness proof of union is complete if we can show that

IN(r1, r2, s1, s2) ⇒
((Size(r1) = 0 ≡ F ∧ IN(Remove(r1, Choose(r1)), Insert(r2, Choose(r1)), s1, s2))

∨ (Size(r1) = 0 ≡ T ∧ R_union(s1, s2)))
To prove the above formula, we need the theorem

Size(r1) > 0 ≡ T ⇒ Size(Remove(r1, Choose(r1))) + 1 ≡ Size(r1).
The while loop terminates because each time in the loop, Size(r1) is reduced, and
Choose(r1) signals no-element when Size(r1) = 0 ≡ T.
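The control structure of Figure 4.11 and its loop invariant can be tried out in an executable form. This is a sketch only: Python frozensets stand in for Set-Int", the class NoElement stands in for the exception no-element, and choose returns the minimum merely to keep the run deterministic. None of these names come from the specification.

```python
class NoElement(Exception):
    """Stands in for the exception no-element signalled by Choose."""


def choose(s):
    if not s:
        raise NoElement()
    return min(s)  # any member would do; min keeps the sketch deterministic


def invariant(r1, r2, s1, s2):
    """IN(r1, r2, s1, s2): same members overall, and the sizes do not grow."""
    same_members = (r1 | r2) == (s1 | s2)
    return same_members and len(r1) + len(r2) <= len(s1) + len(s2)


def union(s1, s2):
    r1, r2 = frozenset(s1), frozenset(s2)
    try:
        while True:
            assert invariant(r1, r2, s1, s2)
            i = choose(r1)      # signals NoElement when r1 is empty
            r1 = r1 - {i}       # Remove(r1, i): Size(r1) decreases by one
            r2 = r2 | {i}       # Insert(r2, i)
    except NoElement:
        pass                    # handler associated with the loop
    return r2
```

The loop has no exit test of its own; it ends only because choose eventually signals, which is the termination argument given above.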

An alternate approach to the Floyd-Hoare method of reasoning about programs
is to use the first order semantics of control structures as suggested by Cartwright and
McCarthy [8]. They have shown how reasoning about recursive programs can be
carried out completely in first order logic. The definition of a recursive program can be
considered as an axiom defining the function computed by the program with an
appropriate condition on variables. The termination of such a program can also be proved
by adding a minimization scheme corresponding to its function. For example, the above
iterative union program can be transformed to an equivalent recursive program, and the
axiom characterizing the function computed by the program is derived from the recursive
program. Th(Set-Int") is enriched by adding this axiom about union and a minimization
scheme corresponding to union. The input-output specification of union can then be
proved as a theorem in the enriched theory. We use a similar approach in the next chapter
in showing the correctness of an implementation.

4.3.6 Properties of a Specification 

It should be clear from the discussion in the previous subsections that the
following extension of Theorem 4.3 holds:

Thm. 4.11 For a consistent S,

(i) for any ground terms e_1 and e_2 of the same type, both 'e_1 ≡ e_2' and 'e_1 ≢ e_2' cannot be
in Th(S), and

(ii) for any two ground exception terms ext_1 and ext_2, both 'ext_1 ≡ ext_2' and 'ext_1 ≢ ext_2'
cannot be in Th(S), and

(iii) for any ground term e, both 'N?(e) ≡ T' and 'N?(e) ≡ F' cannot be in Th(S). ■

We extend the definitions of sufficient completeness, completeness, and well 
definedness properties discussed in Subsection 4.2.6 to the specifications specifying 
exceptional behavior. The results about these properties in Subsection 4.2.6 directly extend 
when the modified definitions are used.



6. The condition is that a variable is instantiated to a value of its type other than ⊥, which is used to denote
non-termination.






4.3.6.1 Sufficient Completeness 

Recall that the sufficient completeness property as defined in Subsection 4.2.6
requires that the behavior of the observers on any intended input should be deducible by
equational reasoning. When a specification specifies data types having operations which
signal exceptions, then the observable behavior of the operations also includes their
exceptional behavior. Two values of a data type can also be distinguished in this case if a
sequence of operations signals an exception on one value and does not signal on the other,
or if the sequence of operations signals different exceptions on different values. In the
extended definition of sufficient completeness, we want to capture the intuition that in
addition to the normal behavior of the observers, a sufficiently complete specification must
also completely specify the exceptional behavior of the operations when their inputs satisfy
the associated preconditions.

If a specification has only required exception conditions for the operations, then 
the above amounts to requiring that 

(i) for any legal ground term e, either 'N?(e) ≡ T' ∈ EQ(S) or 'N?(e) ≡ F' ∈ EQ(S), and

(ii) (a) if 'N?(e) ≡ T' ∈ EQ(S) and e is of type D' ∈ Δ, then the condition stated in Def. 4.6
must be satisfied (i.e., there is a ground term e' not having any operation symbol of D or
auxiliary functions used in S such that 'e ≡ e'' ∈ EQ(S)), and

(b) if 'N?(e) ≡ F' ∈ EQ(S) and for every subterm e_i of e, 'N?(e_i) ≡ T' ∈ EQ(S), then
the formula 'e signals ext' ∈ EQ(S) for some ground exception term ext.

If S specifies optional exceptions also, then there are legal ground terms for which
neither 'N?(e) ≡ T' nor 'N?(e) ≡ F' is provable. For example, we can neither prove

N?_Int(Top(Push^101((Null, 1), ..., 101))) ≡ T
nor

N?_Int(Top(Push^101((Null, 1), ..., 101))) ≡ F
from the specification of Stk-Int. For such a specification, the definition of sufficient
completeness must include the condition that for such a ground term, if we assume
'N?(e) ≡ T', then 'e ≡ e'' is derivable using equational reasoning. This condition is
based on an aspect of the semantics of a specification, namely that if an operation does not
signal on an input for which it had the option to signal, then the formulas in the axioms
component for the operation behavior must hold.

Def. 4.9 A specification S is sufficiently complete if and only if 

(i) for every e of type D' ∈ Δ, if 'N?(e) ≡ T' ∈ EQ(S), then there is a theorem 'e ≡ e'' ∈
EQ(S) for some e', a ground term of type D' not having any operation symbol of D and
auxiliary function in S,

(ii) for every e (= σ(e_1, ..., e_n)) of type D' ∈ Δ ∪ { D }, if 'N?(e) ≡ F' ∈ EQ(S), and
'(N?(e_1) ∧ ... ∧ N?(e_n)) ≡ T' ∈ EQ(S), then there is a theorem 'e signals ext' ∈ EQ(S) for
some ground exception term ext, and

(iii) for every legal ground term e of type D' ∈ Δ ∪ { D }, if neither 'N?(e) ≡ T' ∈ EQ(S)
nor 'N?(e) ≡ F' ∈ EQ(S), then there exists a subterm e_1 of e such that e_1 = σ(e_11, ..., e_1n)
and 'O[x_1/e_11, ..., x_n/e_1n] ≡ T' ∈ EQ(S), where σ is specified to optionally signal if its
input satisfies O(x_1, ..., x_n), and assuming 'N?(e) ≡ T', there is a theorem 'e ≡ e'' ∈
EQ(S ∪ { N?(e) ≡ T }), where e' is a ground term of type D' having no operation symbol of
D and auxiliary function used in S. ■

S ∪ { f } stands for the nonlogical axioms derived from S plus the formula f, and
EQ(S ∪ { f }) stands for the equational subtheory derived using S ∪ { f } as the nonlogical
axioms. The condition (iii) above amounts to proving the theorem assuming 'N?(e) ≡ T'.

For example, Stk-Int is sufficiently complete. 'Top(Null) signals no-top()' ∈
EQ(S). Assuming 'N?_Int(Top(Push^101((Null, 1), ..., 101))) ≡ T', we can derive
'Top(Push^101((Null, 1), ..., 101)) ≡ 101' in EQ(S).
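The role of the assumption 'N?(e) ≡ T' can be seen by comparing two behaviors that an optional exception permits. The sketch below assumes that Push is allowed, but not required, to signal overflow once its stack argument holds at least 100 elements. The factory make_push, the class Overflow and the helper top_of_push_n are hypothetical names used only here.

```python
class Overflow(Exception):
    """Stands in for the optional exception of Push."""


def make_push(limit=None):
    """Return a Push that signals once the stack holds `limit` items.

    limit=None never signals. Both behaviors are admissible when the
    exception is specified to be optional.
    """
    def push(s, i):
        if limit is not None and len(s) >= limit:
            raise Overflow()
        return s + (i,)
    return push


def top_of_push_n(push, n):
    """Evaluate Top(Push^n((Null, 1), ..., n)) with the given Push."""
    s = ()
    for i in range(1, n + 1):
        s = push(s, i)
    return s[-1]
```

With the unbounded Push the term evaluates to 101, while with make_push(100) the 101st Push signals. The specification alone therefore fixes neither N? ≡ T nor N? ≡ F for this term, but whenever the result is normal it must be 101.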

The specification of Set-Int" is not sufficiently complete, because, for instance,
though 'N?_Int(Choose(Insert(Insert(Null, 0), 1))) ≡ T' ∈ EQ(S), there does not exist any
ground term e' of type Int not having any operation symbol of Set-Int" such that
'Choose(Insert(Insert(Null, 0), 1)) ≡ e'' ∈ EQ(S).

The results discussed about specifications not specifying exceptional behavior in
Subsection 4.2.6 directly extend to specifications specifying exceptional behavior when
appropriately modified. We have

Thm. 4.12 If S is sufficiently complete, then S is behaviorally complete.
Proof See Appendix III. ■

The obvious analog to Theorem 4.5 also holds; its converse is a conjecture analogous to
Conjecture 4.1. We also have

Thm. 4.13 For a consistent and sufficiently complete S, if any two legal ground terms e_1
and e_2 of type D are distinguishable by S, then 'e_1 ≢ e_2' ∈ DS(S).

Proof See Appendix III. ■

4.3.6.2 Completeness and Well Definedness 

The completeness property of a specification can be defined in this case in the
same way as in Subsection 4.2.6. Def. 4.7 in Subsection 4.2.6 works for this case also.
Theorem 4.7 for this case can be proved in the same way as for specifications without
exceptional behavior. It can be shown that the specification of Stk-Int is complete, whereas
the specification of Set-Int" is not complete.

The well definedness property is also defined in the same way as in case of
specifications without exceptional behavior. Def. 4.8 in Subsection 4.2.6 is valid. It can be
shown that the specifications of Set-Int" and Stk-Int are well defined.






4.4 Theory of Nondeterminism 

In this section, we discuss specifications specifying nondeterministic operations.
Again, we first discuss specifications without exceptional behavior; later, we incorporate
the exceptional behavior also. For the first part, we modify the specification of Set-Int'
given in Figure 4.1 so that the operation Choose is specified to be nondeterministic. Let
Set-Int"' stand for the modified specification. In the second part, we use the specification
of Set-Int given in Figure 3.1.

We find it convenient to express properties of a data type with nondeterministic
operations as formulas using nondeterministic operation symbols (which is also the reason
to allow a specification to have such formulas in the axioms component), but such a
formula must be interpreted properly. A nondeterministic function symbol does not have
the substitution property with respect to ≡ unless interpreted properly. We discussed this
in the previous chapter; we repeat the discussion here. For example, the formula
'Choose(s) ∈ s ≡ T' in the specification is to be interpreted as: any integer returned by
Choose on the argument s is in the set s. The formula

s1 ≡ s2 ⇒ Choose(s1) ≡ Choose(s2)
need not hold if 'Choose(s1) ≡ Choose(s2)' is interpreted as: an integer returned by Choose
on s1 is the same as an integer returned by Choose on s2, because different invocations of
Choose on the same argument may return different integers. However, if we interpret
'Choose(s1) ≡ Choose(s2)' as: for every possible integer returned by Choose on s1, Choose
on s2 can return the same integer, and vice versa, then the formula

s1 ≡ s2 ⇒ Choose(s1) ≡ Choose(s2)
holds. We adopt the latter interpretation, so that the substitution property continues to
hold. The adopted interpretation is consistent with the definition of observable
equivalence on ground terms involving nondeterministic operations induced by S, given in
Sections 2.2 and 2.3.
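The difference between the two readings of 'Choose(s1) ≡ Choose(s2)' can be made concrete on a toy model. The sketch below assumes that Choose may return any member of its argument; choose_outcomes, equal_some_run and equal_all_outcomes are hypothetical helper names.

```python
def choose_outcomes(s):
    """All integers Choose may return on the set s (assumed: any member)."""
    return set(s)


def equal_some_run(s1, s2, run1, run2):
    """Former reading: compare the results of two particular invocations."""
    return run1(s1) == run2(s2)


def equal_all_outcomes(s1, s2):
    """Adopted reading: each possible result on s1 is possible on s2, and vice versa."""
    o1, o2 = choose_outcomes(s1), choose_outcomes(s2)
    return all(x in o2 for x in o1) and all(y in o1 for y in o2)
```

Taking s1 and s2 to be the same set {1, 2}, two invocations that pick min and max disagree, so the former reading fails even for identical arguments, whereas the adopted reading holds.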



7. As is discussed in the previous chapter, the reason for rejecting the former interpretation is that the
formula 'σ(x_1, ..., x_n) ≡ σ(x_1, ..., x_n)' for a nondeterministic symbol σ is almost always false under it.



We cannot however express many interesting properties about a data type
because in a formula involving a nondeterministic operation symbol σ, different
occurrences of a term σ(e_1, ..., e_n) may result in different values. We often need to express
properties in which different occurrences of the term σ(e_1, ..., e_n) stand for the same value.
For example, consider another version of the union procedure given in Figure 4.12, which
is a slight modification of the version given in Figure 4.11. In this case, the while loop has
the condition '~ (#(s) = 0)' instead of 'true' in Figure 4.11. In verifying this version of
union, we must use the properties of i, a result returned by Choose. In such a case, we
introduce an auxiliary function σ_p : D_1 × ... × D_n × D' → Bool corresponding to the
nondeterministic operation σ, which is the relation describing the behavior of σ.
(*) σ_p(x_1, ..., x_n, y) ≡ T if σ can return y as a possible result on x_1, ..., x_n,

                         ≡ F otherwise
For example, we introduce Choose_p for Choose and use Choose_p to express a property of
i, a result returned by Choose.

Since formulas in the axioms component of S are expressed using
nondeterministic operation symbols, we transform them to equivalent formulas having only
deterministic symbols using the auxiliary functions corresponding to the nondeterministic
symbols. We discuss the transformation procedure TR below. L(S) now also includes the
auxiliary function σ_p corresponding to every nondeterministic operation symbol σ. The
transformed formulas have a restricted interpretation just as the original formulas in the
axioms component, so we derive unrestricted formulas from the transformed formulas
using the method discussed in Section 4.2 for specifications with deterministic operations.
The precondition specified for a nondeterministic operation σ is taken as the precondition
for the corresponding auxiliary function σ_p. So in the specification of Set-Int"',
'~ #(s) = 0' is the precondition for Choose_p. The unrestricted formulas serve as the
nonlogical axioms of S. To prove a formula f involving nondeterministic operation
symbols, we first transform f using TR, and then prove TR(f) from the nonlogical axioms
of S.

The transformation procedure TR must embed the semantics of S assumed in
Chapter 3. Recall that the semantics of S only requires that for every data type in D(S), the
semantics of S, an operation specified to be nondeterministic must return an appropriate
value on every input; the operation in every data type in D(S) need not have the maximum
amount of nondeterminism specified by S.

4.4.1 Transformation Procedure TR 

Figure 4.12. Procedure Union - III

union = proc(s1, s2 : Set-Int"') returns (Set-Int"')
    i : Int
    r1 : Set-Int"' := s1
    r2 : Set-Int"' := s2
    { r1 ≡ s1 ∧ r2 ≡ s2 }
    while ~ Set-Int"'$Size(r1) = 0 do
        { Choose_p(r1, i) ≡ T ∧ IN(Remove(r1, i), Insert(r2, i), s1, s2) }
        i := Set-Int"'$Choose(r1)
        { IN(Remove(r1, i), Insert(r2, i), s1, s2) }
        r1 := Set-Int"'$Remove(r1, i)
        r2 := Set-Int"'$Insert(r2, i)
        { IN(r1, r2, s1, s2) }
    end
    { R_union(s1, s2) }
    return (r2)
    { R }
end union

IN(r1, r2, s1, s2) = (∀ j) [ ((Has(s1, j) ∨ Has(s2, j)) ⇔ (Has(r1, j) ∨ Has(r2, j))) ≡ T ] ∧
    (Size(r1) + Size(r2)) ≤ (Size(s1) + Size(s2)) ≡ T ∧ Size(r2) ≥ 0 ≡ T

I/O Specification for union

T ⇒ R, where R = R1 ∧ R2, and

R1 = (∀ i) [ ((Has(s1, i) ∨ Has(s2, i)) ⇔ Has(union(s1, s2), i)) ≡ T ]
R2 = Size(union(s1, s2)) ≤ Size(s1) + Size(s2) ≡ T

We first describe the procedure TR and later verify that TR(f) is semantically
equivalent to f. Before describing the transformation procedure, we illustrate it using
examples. Consider the following formula in the axioms component of Set-Int"':

Choose(s) ∈ s ≡ T

The above formula states that every value returned by Choose is in the set s. The
transformed formula obtained after applying the procedure would be
( (∀ i) [ Choose_p(s, i) ≡ T ⇒ i ∈ s ≡ T ] ∧ (∃ i) Choose_p(s, i) ≡ T )
The second conjunct states that Choose returns at least one value on every input. The
unrestricted formula, which serves as a nonlogical axiom of Set-Int"', is obtained using the
precondition for Choose; it is given below:
( (∀ i) [ ~ #(s) = 0 ≡ T ⇒ (Choose_p(s, i) ≡ T ⇒ i ∈ s ≡ T) ] ∧

(∃ i) [ ~ #(s) = 0 ≡ T ⇒ Choose_p(s, i) ≡ T ] )
Let us consider another formula 'Choose(s1) ≡ Choose(s2)'. This states that for every
value returned by Choose on s1, there is an observably equivalent value returned by
Choose on s2, and vice versa. TR transforms this formula to
( (∀ i1) [ Choose_p(s1, i1) ≡ T ⇒ (∃ i2) [ Choose_p(s2, i2) ≡ T ∧ i1 ≡ i2 ] ] ∧
(∀ i2) [ Choose_p(s2, i2) ≡ T ⇒ (∃ i1) [ Choose_p(s1, i1) ≡ T ∧ i1 ≡ i2 ] ] )
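The unrestricted axiom for Choose_p can be checked against candidate behaviors on a small finite carrier. The sketch below is an illustration under assumptions: the four-element universe, the two sample relations and the function names are all hypothetical. It also shows that an implementation with less than the maximum amount of nondeterminism still satisfies the axiom.

```python
from itertools import combinations

UNIVERSE = range(4)  # small assumed carrier standing in for Int


def choose_p_max(s, i):
    """A deterministic behavior: Choose always returns the largest element."""
    return len(s) > 0 and i == max(s)


def choose_p_any(s, i):
    """The maximally nondeterministic behavior: any member may be returned."""
    return i in s


def satisfies_axiom(choose_p, s):
    pre = len(s) != 0                                   # ~ #(s) = 0
    # (forall i)[pre => (Choose_p(s, i) => i in s)]
    sound = all((not pre) or (not choose_p(s, i)) or (i in s) for i in UNIVERSE)
    # (exists i)[pre => Choose_p(s, i)]
    total = any((not pre) or choose_p(s, i) for i in UNIVERSE)
    return sound and total


def all_sets():
    """Every subset of the universe, as frozensets."""
    items = list(UNIVERSE)
    for k in range(len(items) + 1):
        for c in combinations(items, k):
            yield frozenset(c)
```

A relation that never holds fails the second conjunct on any nonempty set, which is exactly the case the existential conjunct is meant to exclude.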
We now present the transformation procedure TR, which is defined inductively
making use of the structure of a formula.
Basis f is an atomic formula 'e_1 ≡ e_2'.
(a) f does not have any occurrence of a nondeterministic operation symbol:

TR(f) = f
(b) both e_1 and e_2 have occurrences of nondeterministic operation symbols:
We wish TR(f) to roughly express that for every instance of the free variables in f, for
every possible choice made about the invocations of the nondeterministic operation
symbols in e_1, there are choices for the invocations of the nondeterministic operation
symbols in e_2 such that the instantiations of e_1 and e_2 return equivalent results, and vice
versa.

TR('e_1 ≡ e_2') has the following structure:

(∀ z_1, ..., z_m) [ c_1 ⇒ (∃ y_1, ..., y_k) [ c_2 ∧ e'_1 ≡ e'_2 ] ] ∧

(∀ y_1, ..., y_k) [ c_2 ⇒ (∃ z_1, ..., z_m) [ c_1 ∧ e'_1 ≡ e'_2 ] ]

where z_1, ..., z_m are new variables such that corresponding to each occurrence of a
nondeterministic operation symbol σ in e_1, say the occurrence σ(e_1, ..., e_n), there is a
variable z to stand for the possible result returned by σ on its input. The formula c_1 is a
conjunction of the equations of the form 'σ_p(e'_1, ..., e'_n, z_i) ≡ T', stating conditions on z_i.
Similarly for e_2, new variables y_1, ..., y_k are introduced, and c_2 is obtained from e_2. e'_1 and
e'_2 are obtained from e_1 and e_2 respectively, by substituting z_1, ..., z_m and y_1, ..., y_k for
subterms having nondeterministic operations as the outermost operation in e_1 and e_2
respectively. We discuss later how c_1 and e'_1 are constructed from e_1, and c_2 and e'_2 are
obtained from e_2.

(c) only one side of the equation 'e_1 ≡ e_2' has occurrences of nondeterministic operation
symbols. Without any loss of generality, we assume that only the l.h.s. has occurrences of
nondeterministic symbols.

Construct c_1 and e'_1 from e_1 as discussed above. Then,

TR('e_1 ≡ e_2') = (∀ z_1, ..., z_m) [ c_1 ⇒ e'_1 ≡ e_2 ] ∧ (∃ z_1, ..., z_m) c_1
This completes the basis step of the definition of TR. The second conjunct is to ensure that
there is at least one value returned by e_1.

Inductive Step 

Since all other logical symbols can be expressed in terms of ∧, ~ and ∀, we define
how TR works on formulas having these symbols,

(a) if f is ~ f_1, then TR(f) = ~ TR(f_1)

(b) if f is f_1 ∧ f_2, then TR(f) = TR(f_1) ∧ TR(f_2)

(c) if f is (∀ x) f_1, then TR(f) = (∀ x) TR(f_1).
This completes the definition of TR.

For instance, a conditional equation 'b ⇒ e_1 ≡ e_2', where b is a boolean term, is
transformed to

b ⇒ TR('e_1 ≡ e_2'),
if b does not have any nondeterministic operation symbols. If b has nondeterministic
symbols, then the conditional equation is transformed to
TR('b ≡ T') ⇒ TR('e_1 ≡ e_2')
= ((∀ z'_1, ..., z'_k) [ c ⇒ b' ≡ T ] ∧ (∃ z'_1, ..., z'_k) c) ⇒ TR('e_1 ≡ e_2').
Since such a b is assumed to behave deterministically (See Section 3.1), i.e., for an
instantiation of the free variables X in the conditional equation, b interprets either to T or
to F, the above formula agrees with the interpretation of a conditional equation assumed in
Section 3.2 on the semantics of a specification.

We now describe how to construct $c$ and $e'$ from a term $e$ by induction on the
number of occurrences of nondeterministic operation symbols in $e$. Let $k$ stand for the
number of occurrences of nondeterministic operation symbols in $e$.

Basis $k = 1$

Let $e^1 = \sigma(e^1_1, \ldots, e^1_n)$ be the subterm of $e$ having the nondeterministic operation $\sigma$ as
its outermost operation symbol. Then $c$ is $\sigma_p(e^1_1, \ldots, e^1_n, z_1) \equiv T$ and $e'$ is obtained by
replacing $e^1$ in $e$ by $z_1$. The type of $z_1$ is the range type of $\sigma$.

Inductive Step Assume $c$ and $e'$ can be constructed if $e$ has $k' < k$ occurrences of
nondeterministic symbols. Show for $k$.

(i) If $e$ has a subterm having $k$ occurrences of nondeterministic operation symbols,
let the subterm be $e^1 = \sigma(e_1, \ldots, e_n)$, where $\sigma$ is a nondeterministic operation symbol.
Each $e_i$ has fewer than $k$ occurrences of nondeterministic operation symbols. By the
inductive hypothesis, let $c_1, \ldots, c_n$ be the formulas obtained by applying this procedure on
$e_1, \ldots, e_n$ respectively, and let $e'_1, \ldots, e'_n$ be the terms obtained by replacing subterms
having nondeterministic operation symbols by new variables in $e_1, \ldots, e_n$ respectively.
Then

$$c = \sigma_p(e'_1, \ldots, e'_n, z) \equiv T \wedge c_1 \wedge \cdots \wedge c_n$$
and $e'$ is obtained by replacing $e^1$ in $e$ by $z$.

(ii) There is no such subterm of $e$. Consider all outermost subterms of $e$ having a
nondeterministic operation symbol as their outermost operation; let them be $e_1, \ldots, e_n$.
Each of these subterms has fewer than $k$ occurrences of nondeterministic
operation symbols. By the inductive hypothesis, let $c_1, \ldots, c_n$ be the formulas obtained by
transforming $e_1, \ldots, e_n$ respectively, and let $e'_1, \ldots, e'_n$ be the terms obtained by replacing
subterms having nondeterministic operation symbols by new variables in $e_1, \ldots, e_n$
respectively. Then

$$c = c_1 \wedge \cdots \wedge c_n,$$
and $e'$ is obtained by replacing $e_1, \ldots, e_n$ by $e'_1, \ldots, e'_n$ respectively.
This completes the discussion about how $c$ and $e'$ are obtained from $e$.
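
The construction can be illustrated with a small Python sketch. It is not part of the specification language: the term representation (`Var`, `App`), the set `NONDET`, the helper names `flatten` and `show`, and the fresh-variable names `z1, z2, ...` are all assumptions made for illustration. Each condition returned stands for an atom of the form 'σ_p(...) ≡ T', and the sketch handles any number of occurrences in one recursive pass rather than following the case split (i)/(ii) literally.

```python
from dataclasses import dataclass
from itertools import count
from typing import Iterator, List, Optional, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    op: str
    args: Tuple["Term", ...] = ()


Term = Union[Var, App]

# Assumption of this sketch: Choose is the only nondeterministic symbol.
NONDET = {"Choose"}


def flatten(term: Term, fresh: Optional[Iterator[int]] = None) -> Tuple[List[App], Term]:
    """Return (c, e'): the list of conjuncts of c and the term e'.

    Every subterm whose outermost symbol is nondeterministic is replaced by a
    fresh variable z, and the atom op_p(args', z) (read: "... = T") is recorded.
    """
    if fresh is None:
        fresh = count(1)
    if isinstance(term, Var):
        return [], term
    conds: List[App] = []
    new_args = []
    for arg in term.args:
        c, a = flatten(arg, fresh)
        conds.extend(c)
        new_args.append(a)
    if term.op in NONDET:
        z = Var(f"z{next(fresh)}")
        # The condition for the outer symbol comes first, followed by the
        # conditions c_1, ..., c_n collected from its arguments.
        conds.insert(0, App(term.op + "_p", tuple(new_args) + (z,)))
        return conds, z
    return conds, App(term.op, tuple(new_args))


def show(t: Term) -> str:
    """Render a term in the usual prefix notation."""
    if isinstance(t, Var):
        return t.name
    if not t.args:
        return t.op
    return f"{t.op}({', '.join(show(a) for a in t.args)})"
```

For the term Has(s, Choose(s)), `flatten` yields the single condition Choose_p(s, z1) and the term Has(s, z1); a nested term such as Choose(Remove(s, Choose(s))) yields two conditions, the outer one first.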




Thm. 4.14 $f$ and $\mathrm{TR}(f)$ are semantically equivalent.
Proof See Appendix III.

4.4.2 Th(S) 

The nonlogical axioms obtained as discussed above are used to prove properties 
about the data type. A nonlogical axiom involves existential quantifiers in contrast to a 
nonlogical axiom of a specification specifying only deterministic operations. So, the whole 
machinery of first order predicate calculus is needed to prove an arbitrary equation or an 
inequality involving nondeterministic symbols. So it is not meaningful to discuss the 
subtheories EQ(S), DS(S), and IND(S); we instead discuss the full theory Th(S). The 
formulas are proved in the same way as in case of specifications specifying deterministic 
operations only. 

As an illustration of the use of Th(S), we verify the version of the procedure union
given in Figure 4.12. Note that the backward substitution semantics of the assignment
statement

    i := Set-Int$Choose(r1)
is given as

    { Choose_p(r1, i') ≡ T ∧ P[i'/i] } i := Set-Int$Choose(r1) { P },
instead of

    { P[Choose(r1)/i] } i := Set-Int$Choose(r1) { P },
because different occurrences of the expression Choose(r1) could possibly return different
results. (Here P[t/i] stands for P with t substituted for i.) For example, the verification condition

    { IN(Remove(r1, Choose(r1)), Insert(r2, Choose(r1)), s1, s2) }
    i := Set-Int$Choose(r1) { IN(Remove(r1, i), Insert(r2, i), s1, s2) }
is not true, whereas

    { Choose_p(r1, i') ≡ T ∧ IN(Remove(r1, i'), Insert(r2, i'), s1, s2) }
    i := Set-Int$Choose(r1) { IN(Remove(r1, i), Insert(r2, i), s1, s2) }
is true. In this case also, 'IN(r1, r2, s1, s2)' serves as an invariant of the loop. Using the
backward substitution semantics of the control structures, we can generate the verification
conditions and show the required formulas to be in Th(Set-Int'). The partial correctness
proof of union is complete if we can show that

    ( ~(Size(r1) = 0) ≡ T ∧ IN(r1, r2, s1, s2) ) ⇒
    ( Choose_p(r1, i) ≡ T ∧ IN(Remove(r1, i), Insert(r2, i), s1, s2) )
To prove the above formula, we need the theorem

    Size(r1) > 0 ≡ T ⇒ Size(Remove(r1, Choose(r1))) + 1 ≡ Size(r1).
The termination is also ensured because each time in the loop, Size(r1) is reduced, so the
loop condition will eventually become false.
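
The difference between the two preconditions can be checked on a small instance. The following Python sketch is an illustration only: sets are modelled by `frozenset`, the relation `IN` is written out from its definition, and the helper names `naive_holds` and `bound_holds` are hypothetical. In the naive form the two occurrences of Choose(r1) are allowed to return different elements; in the form with Choose_p the single chosen value i' is shared.

```python
from itertools import product


def IN(r1, r2, s1, s2):
    """Loop invariant: same members overall, and no growth in total size."""
    same_members = (r1 | r2) == (s1 | s2)
    size_bound = len(r1) + len(r2) <= len(s1) + len(s2)
    return same_members and size_bound and len(r2) >= 0


def naive_holds(r1, r2, s1, s2):
    """Substitute Choose(r1) twice: each occurrence may pick independently."""
    return all(IN(r1 - {a}, r2 | {b}, s1, s2) for a, b in product(r1, repeat=2))


def bound_holds(r1, r2, s1, s2):
    """Use one value i' with Choose_p(r1, i') = T in both positions."""
    return all(IN(r1 - {i}, r2 | {i}, s1, s2) for i in r1)


r1, r2 = frozenset({1, 2}), frozenset()
s1, s2 = frozenset({1, 2}), frozenset()
print(naive_holds(r1, r2, s1, s2), bound_holds(r1, r2, s1, s2))
```

With r1 = {1, 2} the naive condition fails, since removing 1 from r1 while inserting 2 into r2 loses the element 1, whereas the condition using a single i' holds for every possible choice.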

We think that many properties of nondeterministic operations expressed as 
equations and inequalities can be derived from the untransformed nonlogical axioms (the 
nonlogical axioms obtained from the formulas in the Axioms component of the 
specification before applying TR) using techniques employed for deterministic operations, 
for instance, viewing equations as rewrite rules and using the Knuth-Bendix algorithm for
deriving properties. We have not investigated the extent to which this can be done. This 
hypothesis is another reason for preferring to write specifications directly using 
nondeterministic operation symbols as compared to writing them indirectly using the 
relations corresponding to nondeterministic operations. 

4.4.3 Data Types with Exceptional Behavior 

We discuss the modifications required to incorporate the exceptional behavior
specified by the specifications with nondeterministic operations. We describe how to
derive the nonlogical axioms from a specification. We use the original specification of
Set-Int given in Figure 3.1 for illustration; the specification is repeated in Figure 4.13.

As before, an auxiliary function $\sigma_p$ is associated with every nondeterministic
operation symbol $\sigma$. $\sigma_p$ is not strict with respect to its last argument.






Figure 4.13. Specification of Set-Int

Operations

    Null   : → Set-Int                   as ∅
    Insert : Set-Int × Int → Set-Int
    Remove : Set-Int × Int → Set-Int
    Has    : Set-Int × Int → Bool        as x2 ∈ x1
    Size   : Set-Int → Int               as #(x1)
    Choose : Set-Int → Int               nondeterministic
                     → no-element()

Restrictions

    #(s) ≡ 0 ⇒ Choose(s) signals no-element

Axioms

    Remove(∅, i) ≡ ∅
    Remove(Insert(s, i1), i2) ≡ if i1 = i2 then Remove(s, i1) else Insert(Remove(s, i2), i1)
    i ∈ ∅ ≡ F
    i1 ∈ Insert(s, i2) ≡ if i1 = i2 then T else i1 ∈ s
    #(∅) ≡ 0
    #(Insert(s, i)) ≡ if i ∈ s then #(s) else #(s) + 1
    Choose(s) ∈ s ≡ T



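$$\sigma : D_1 \times \cdots \times D_n \rightarrow D' \cup EXV$$

$$\sigma_p : D_1 \times \cdots \times D_n \times (D' \cup EXV) \rightarrow Bool,$$

$$\sigma_p(x_1, \ldots, x_n, ze) = \begin{cases} T & \text{if } N?_{D'}(ze) \equiv T \text{ and } \sigma \text{ can return } ze \text{ as a possible result on } x_1, \ldots, x_n, \\ T & \text{if } N?_{D'}(ze) \equiv F \text{ and } \sigma \text{ signals } ze \text{ on } x_1, \ldots, x_n, \\ F & \text{otherwise} \end{cases}$$

Recall that $ze$ is of union type.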

We extend the transformation procedure TR discussed in the previous subsection.
Besides equations, we have two additional kinds of atomic formulas: '$e$ signals $ext$' and
'$ext_1 \equiv ext_2$'. TR for equations is the same as in the previous subsection except that the new
variables introduced in the transformation are of union type.

An exception name is treated like a deterministic operation symbol, so
'$ext_1 \equiv ext_2$' is treated like an equation '$e_1 \equiv e_2$'. TR is extended to treat '$e$ signals $ext$' as
'$e \equiv ext$'. TR is applied on '$e \equiv ext$'. In the transformed formula, a subformula of the form
'$e' \equiv ext$', wherever $ext$ is an exception term and $e'$ is a non-variable term, is replaced by
the subformula '$e'$ signals $ext$'. Note that a transformed formula may involve terms
constructed using variables ranging over union types.

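The restrictions on a nondeterministic operation $\sigma$ are transformed to get the
nonlogical axioms as follows: A restriction specifying a required exception for $\sigma$,

$$R_i(X) \Rightarrow \sigma(X) \text{ signals } ext_i,$$
is transformed to

$$P_\sigma(X) \Rightarrow (\, R_i(X) \Rightarrow \sigma_p(X, ext_i) \equiv T \,).$$
For example, from the restriction on Choose,

$$\#(s) \equiv 0 \Rightarrow \mathrm{Choose}(s) \text{ signals } \mathrm{no\text{-}element}(),$$
we get

$$\#(s) \equiv 0 \Rightarrow \mathrm{Choose\_p}(s, \mathrm{no\text{-}element}()) \equiv T.$$
A restriction specifying an optional exception for $\sigma$,

$$\sigma(X) \text{ signals } ext_i \Rightarrow O_i(X),$$
is transformed to

$$P_\sigma(X) \Rightarrow (\, \sigma_p(X, ext_i) \equiv T \Rightarrow O_i(X) \equiv T \,).$$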
Axioms defining $N?_{D'}$ are constructed the same way as for the specification with
deterministic operations except that there is no axiom due to a nondeterministic operation
$\sigma$ because the range of the corresponding auxiliary function $\sigma_p$ is Bool and not
Bool $\cup$ EXV. In addition to the axioms and rules expressing general properties of the
exceptional behavior of the operations discussed in the previous sections, we have another
rule. Recall that a nondeterministic operation can either signal an exception or has the
choice to return one of many possible normal values. An operation does not have the
choice between returning a normal value and signalling an exception on the same input.
This property is captured by the following axiom for every nondeterministic operation $\sigma$:

$$\neg \big( (\exists ze)[\, \sigma_p(X, ze) \equiv T \wedge N?_{D'}(ze) \equiv T \,] \wedge (\exists ze)[\, \sigma_p(X, ze) \equiv T \wedge N?_{D'}(ze) \equiv F \,] \big).$$
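
As an illustration, the auxiliary function for Choose and the axiom above can be checked on small sets in Python. This is a sketch under stated assumptions: sets are modelled by `frozenset`, the exception value no-element() is modelled by the string constant `NO_ELEMENT`, and the names `is_normal`, `choose_p` and `exclusive` are hypothetical helpers, not part of the specification.

```python
NO_ELEMENT = "no-element()"  # stands for the exception value in EXV


def is_normal(ze):
    """Plays the role of N?: true on normal values, false on exception values."""
    return ze != NO_ELEMENT


def choose_p(s, ze):
    """Choose_p(s, ze): ze is a possible outcome of Choose on s."""
    if len(s) == 0:
        return ze == NO_ELEMENT          # required exception on the empty set
    return is_normal(ze) and ze in s     # any member is a possible result


def exclusive(s, universe):
    """The axiom: no input admits both a normal and an exceptional outcome."""
    normal = any(choose_p(s, ze) and is_normal(ze) for ze in universe)
    exceptional = any(choose_p(s, ze) and not is_normal(ze) for ze in universe)
    return not (normal and exceptional)


universe = [1, 2, 3, NO_ELEMENT]
for s in (frozenset(), frozenset({1}), frozenset({1, 3})):
    outcomes = [ze for ze in universe if choose_p(s, ze)]
    print(sorted(s), outcomes, exclusive(s, universe))
```

In this model the empty set has the single outcome no-element(), every nonempty set has only normal outcomes, and at least one outcome exists for every input.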

From the formulas in the axioms component of S, the nonlogical axioms are
derived as follows: We apply TR on a restricted formula to replace nondeterministic
operation symbols by the corresponding auxiliary functions. Since the restricted formula
expresses the normal behavior of the operations, the new variables introduced in the
transformation range only over normal values. So, we use variables of a single type instead of
the union type. For instance, for an equation '$e_1 \equiv e_2$' having nondeterministic operations
on both sides, we get

$$(\forall z_1, \ldots, z_m)[\, c_1 \Rightarrow (\exists y_1, \ldots, y_p)[\, c_2 \wedge e'_1 \equiv e'_2 \,] \,] \;\wedge$$
$$(\forall y_1, \ldots, y_p)[\, c_2 \Rightarrow (\exists z_1, \ldots, z_m)[\, c_1 \wedge e'_1 \equiv e'_2 \,] \,].$$

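To get the corresponding unrestricted formula incorporating the exceptional behavior of
the operations and the preconditions, we must require that

(i) '$N?_{D'}(e'_1) \equiv T$' and '$N?_{D'}(e'_2) \equiv T$' hold, and

(ii) every operation invocation in the formula must satisfy the associated precondition.
The unrestricted formula for the above restricted formula is

$$(\forall z_1, \ldots, z_m)\big[\, N?_{D'}(e'_1) \Rightarrow \big( PC_{c_1} \Rightarrow \big( c_1 \Rightarrow (\exists y_1, \ldots, y_p)\big[\, N?_{D'}(e'_2) \Rightarrow \big( (PC_{c_2} \wedge PC_{e'_1} \wedge PC_{e'_2}) \Rightarrow (c_2 \wedge e'_1 \equiv e'_2) \big) \big] \big) \big) \big] \;\wedge$$
$$(\forall y_1, \ldots, y_p)\big[\, N?_{D'}(e'_2) \Rightarrow \big( PC_{c_2} \Rightarrow \big( c_2 \Rightarrow (\exists z_1, \ldots, z_m)\big[\, N?_{D'}(e'_1) \Rightarrow \big( (PC_{c_1} \wedge PC_{e'_1} \wedge PC_{e'_2}) \Rightarrow (c_1 \wedge e'_1 \equiv e'_2) \big) \big] \big) \big) \big].$$

A similar transformation can be obtained for a restricted formula of the form

'$e_1 \equiv$ if $b$ then $e_2$'.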

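For example, the formula

$$\mathrm{Choose}(s) \in s \equiv T$$

in the specification of Set-Int is transformed first to the restricted formula using TR,

$$\big( (\forall i)[\, \mathrm{Choose\_p}(s, i) \equiv T \Rightarrow i \in s \equiv T \,] \wedge (\exists i)[\, \mathrm{Choose\_p}(s, i) \equiv T \,] \big),$$

and later to

$$\big( (\forall i)[\, N?_{Bool}(i \in s) \equiv T \Rightarrow ( \mathrm{Choose\_p}(s, i) \equiv T \Rightarrow ( N?_{Bool}(T) \equiv T \Rightarrow ( i \in s \equiv T ))) \,] \wedge (\exists i)[\, \mathrm{Choose\_p}(s, i) \equiv T \,] \big),$$

which gets simplified to

$$\big( (\forall i)[\, \mathrm{Choose\_p}(s, i) \equiv T \Rightarrow i \in s \equiv T \,] \wedge (\exists i)[\, \mathrm{Choose\_p}(s, i) \equiv T \,] \big),$$

because '$N?_{Bool}(i \in s) \equiv T$' and '$N?_{Bool}(T) \equiv T$' are derivable.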

Figure 4.14 is yet another implementation of union using the nondeterministic
operation Choose which signals on the empty set. The version is similar to the version
given in Figure 4.11 except that Choose is nondeterministic. It can also be verified using
the properties in Th(Set-Int).



Figure 4.14. Procedure Union - IV

union = proc(s1, s2 : Set-Int) returns (Set-Int)
    i : Int
    r1 : Set-Int := s1
    r2 : Set-Int := s2
    { r1 ≡ s1 ∧ r2 ≡ s2 }
    while true do
        { (Size(r1) = 0 ≡ F ∧ Choose_p(r1, i) ≡ T ∧ IN(Remove(r1, i), Insert(r2, i), s1, s2))
          ∨ (Size(r1) = 0 ≡ T ∧ R[r2/union(s1, s2)]) }
        i := Set-Int$Choose(r1)
        { IN(Remove(r1, i), Insert(r2, i), s1, s2) }
        r1 := Set-Int$Remove(r1, i)
        r2 := Set-Int$Insert(r2, i)
        { IN(r1, r2, s1, s2) }
    end except when no-element:
    end
    { R[r2/union(s1, s2)] }
    return (r2)
    { R }
end union

IN(r1, r2, s1, s2) = ((∀ j) [ (Has(s1, j) ∨ Has(s2, j)) ⇔ (Has(r1, j) ∨ Has(r2, j)) ≡ T ] ∧
    (Size(r1) + Size(r2)) ≤ (Size(s1) + Size(s2)) ≡ T ∧ Size(r2) ≥ 0 ≡ T)

I/O Specification for union

    T ⇒ R, where R = R1 ∧ R2, and

    R1 = (∀ i) [ (Has(s1, i) ∨ Has(s2, i)) ⇔ Has(union(s1, s2), i) ≡ T ]
    R2 = Size(union(s1, s2)) ≤ Size(s1) + Size(s2) ≡ T
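
The behavior of this version of union can be simulated in Python. The sketch below is only an illustration under assumptions: sets are modelled by `frozenset`, the nondeterministic Choose is simulated by a seeded random choice, the exception no-element is modelled by the hypothetical class `NoElement`, and the invariant is checked with a runtime assertion rather than proved.

```python
import random


class NoElement(Exception):
    """Stands for the exception no-element signalled by Choose."""


def choose(s, rng):
    """Simulated nondeterministic Choose: any member, or signal on the empty set."""
    if not s:
        raise NoElement()
    return rng.choice(sorted(s))


def IN(r1, r2, s1, s2):
    """Loop invariant of union, written out from its definition."""
    return (r1 | r2) == (s1 | s2) and len(r1) + len(r2) <= len(s1) + len(s2)


def union(s1, s2, rng):
    r1, r2 = frozenset(s1), frozenset(s2)
    try:
        while True:
            i = choose(r1, rng)      # the loop exits when Choose signals
            r1 = r1 - {i}
            r2 = r2 | {i}
            assert IN(r1, r2, s1, s2)
    except NoElement:
        pass
    return r2


result = union(frozenset({1, 2, 3}), frozenset({3, 4}), random.Random(0))
print(sorted(result))
```

Whatever element Choose returns on each iteration, the invariant is preserved and the result satisfies R1 and R2; only the order in which elements move from r1 to r2 depends on the choices made.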






4.4.4 Properties of a Specification 

We can prove theorems analogous to Theorems 4.10 and 4.11 for specifications 
specifying nondeterministic operations and exceptional behavior, demonstrating the 
soundness of the axioms capturing general properties of data types. 

The definition of sufficient completeness property has to be modified 
significantly, because there is no meaningful definition of the equational subtheory for 
such specifications. Because of the semantics of S as defined in Section 3.2, it does not help 
to consider only the formulas involving deterministic operations and the auxiliary functions 
corresponding to nondeterministic operation symbols. Recall that for a behaviorally
complete specification, for every input $X$ to a nondeterministic operation $\sigma$, the
corresponding auxiliary function is required to hold for at least one $(X, ze)$, where $ze$ is a
possible result returned by $\sigma$ on $X$, and the axioms do not precisely specify the values on
which the auxiliary function holds. This incompleteness is because the semantics of S does
not constrain an operation specified to be nondeterministic to have any fixed amount of 
nondeterminism (see Section 3.2). 

A plausible modification to the definition of sufficient completeness is to require 
it to use the whole machinery of first order predicate calculus for deduction. Instead of 
requiring a theorem to be in EQ(S), we require it to be in Th(S). In addition, the definition 
of sufficient completeness given in Subsection 4.3.6 must also be modified to deal with the 
case when a legal ground term $e$ involves nondeterministic operation symbols. For $e$ of
type $D' \in \Delta$, if '$N?_{D'}(e) \equiv T$' $\in$ Th(S), it cannot usually be proved equivalent to a ground
term of type D' having no operation symbol of D, as in case of
Choose(Insert(Insert(Null, 1), 2)) for example. Instead we must prove that there exists a set
of ground terms $\{e_1, \ldots, e_k\}$ of type D' not having any operation symbol of D such that

$$(\exists z_1, \ldots, z_m)[\, c \wedge (e' \equiv e_1 \vee e' \equiv e_2 \vee \cdots \vee e' \equiv e_k) \,]$$

where $c$ is the condition on $z_1, \ldots, z_m$ generated due to $e$ when we apply the procedure TR,
and $e'$ is the term obtained from $e$ by substituting $z_1, \ldots, z_m$ for the subterms having
nondeterministic operation symbols as their outermost operation. $\{e_1, \ldots, e_k\}$ consists of
all possible outcomes of $e$. (Since it is assumed that '$N?_{D'}(e) \equiv T$' $\in$ Th(S), $z_1, \ldots, z_m$ are of
a single type instead of a union type.) For example, in case of
Choose(Insert(Insert(Null, 1), 2)), we can show that

$$(\exists i)[\, \mathrm{Choose\_p}(\mathrm{Insert}(\mathrm{Insert}(\mathrm{Null}, 1), 2), i) \equiv T \wedge (i \equiv 1 \vee i \equiv 2) \,]$$

We have not investigated the relationship between the above definition of 
sufficient completeness and the behavioral completeness property for such specifications. 
We conjecture that most of the results (Theorems 4.12, and 4.13 in particular) of 
Subsection 4.3.6, when appropriately modified, would hold for such specifications also. 

The definition of well definedness given in Subsection 4.2.6 directly extends to
this case also. The definition of completeness, like the definition of sufficient
completeness, must require in this case that for any two legal ground terms $e_1$ and $e_2$ of the
same type, '$e_1 \equiv e_2$' $\in$ Th(S) if and only if $e_1$ and $e_2$ are observably equivalent. The
definition 4.8 of well definedness given in Subsection 4.2.6 is valid in this case also.






4.5 Strong Equivalence of Specifications 

In Subsection 3.2.6, we defined the equivalence on specifications; the definition 
required two equivalent specifications to have the same semantics. As discussed in 
Subsection 4.2.6, two equivalent specifications can be different in what properties of a data 
type (a set of data types) can be deduced from them. Below, we define a stronger 
equivalence relation on specifications, which not only requires that the two specifications 
have the same semantics, but also that the same properties can be deduced from the 
specifications. 

Def. 4.10 Two specifications $S_1$ and $S_2$ are strongly equivalent if and only if, assuming that
for every type used in $S_1$ and $S_2$ we use the same theory,
(i) $S_1$ and $S_2$ are equivalent, i.e., $\mathcal{D}(S_1) = \mathcal{D}(S_2)$, and

If $S_1$ (or $S_2$) specifies a nondeterministic operation $\sigma$, we assume that L(D) includes the
corresponding auxiliary function $\sigma_p$ in place of $\sigma$.






5. Correctness of Implementation 

One of the main purposes of designing a specification of a data type is to have a 
standard that can be used to verify whether an alleged implementation of the data type is 
correct. In this chapter, we propose a correctness criterion for an implementation of a data 
type with respect to its specification, and discuss a method embodying the proposed 
correctness criterion. In this process, we also exhibit how the theory of a data type 
discussed in the previous chapter is used. 

An implementation of a data type D is concerned with how to realize the behavior 
of D, in contrast to its specification where the main concern is to precisely state its behavior. 
Intuitively speaking, our correctness criterion is that a correct implementation with respect 
to a specification must have the same observable behavior as prescribed by the 
specification. 

Our approach for proving correctness of an implementation is similar to that of 
Hoare [37], Zilles [76] and Guttag et al. [29], and is radically different from the ADJ group's 
approach [23]. We separate the correctness method from the semantics of the host 
programming language in which an implementation is coded. We do not wish to concern 
ourselves with the issue of semantics of the control structures in the programming 
language, so we assume that the semantics of the procedures implementing the operations 
of D is already derived from their code. In contrast, the ADJ group does not seem to 
separate the correctness method from the semantics of the host programming language. It 
seems to be incorporating the semantics of the control structures used in implementing the 
operations into the correctness method, for instance, see their definition of deriver, which is 
a morphism from the specification algebra to the implementation algebra [23]. This makes 
its approach complex and restrictive. 

An implementation uses data types abstractly; it does not refer to any particular 
implementation of a data type used in it. A recursive implementation of a data type D is an 
exception because a reference to D in the recursive implementation is interpreted as the 
reference to the implementation itself. We discuss recursive implementations later in the 
chapter; until then, we assume that an implementation of a data type does not use the data
type itself. For the time being, we also rule out mutually recursive implementations of a
collection of (recursive or non-recursive) data types in which an implementation I of a data
type D uses a data type D' and an implementation I' of D' uses D. We discuss mutually
recursive implementations later with recursive implementations.

While deriving the semantics of the procedures implementing the operations of D
in an implementation I, we do not use the semantics of any particular implementation of a
data type D' used in I. We instead use the theory constructed from the specification S' of
D', abstracting from all correct implementations of D' with respect to S'. The proof of
correctness of an implementation of D thus does not depend on any property specific to a
particular implementation of D'. It remains valid, even when an implementation of D' is
modified or replaced, as long as the new implementation of D' is correct with respect to the
specification of D'. This separation of the proof of use from the proof of implementation
hierarchically structures the correctness proof, reducing the complexity of the verification
process [37].

In the first section, we discuss the correctness criterion and present an overview of
different steps in the correctness method. In the second section, we discuss the
implementation structure and the semantics of an implementation. In the third section, we
describe in detail the method for proving correctness of an implementation with respect to
a specification. In the fourth section, we discuss extensions to the proposed method for
proving correctness of recursive and mutually recursive implementations.






5.1 Correctness Criterion and Overview of Correctness Method 

As discussed in Chapter 3, a specification S in general specifies a set $\mathcal{D}(S)$ of
related data types, because the behavior of some of the operations is intentionally left
unspecified on certain inputs. In an implementation, the behavior of the procedures
implementing these operations must be defined on all inputs in their domains, because an
implementation in most programming languages realizes a single data type. The designer
of an implementation must pick one data type from the set $\mathcal{D}(S)$ of data types.

If a specification specifies preconditions for the operations, the designer of an
implementation has the freedom to decide what the procedure implementing such an
operation should do on an input not satisfying its precondition. This is because in defining
the semantics of a specification, it is assumed to be the user's responsibility to ensure that
the input to the procedure satisfies the specified precondition. If a precondition is specified
for a constructor, the procedure implementing the constructor could either signal or return a
value of the defined type. However, the value returned must be constructible by a
procedure implementing a constructor using inputs satisfying its precondition (see
discussion on p. 89 for the elaboration of this assumption). If a precondition is specified
for an observer, the procedure implementing the observer could return a value of its range
type, or signal. For example, the operations Pop and Replace of Stk-Int are specified to
have '~(Empty(s) ≡ T)' as the precondition. An implementation of Stk-Int could have, for
example, the procedure implementing the constructor Pop either signal on an empty stack
or return an arbitrary stack.

For an operation specified to optionally signal exceptions, if the input to the
procedure implementing the operation satisfies the associated condition, the designer has a
choice between signalling the specified exception and returning a normal result that
satisfies the axioms. For example, if optional exceptions are used to specify the size
requirement on the values of a data type, as in case of Stk-Int, an implementation must
decide the maximum size of the values. The procedure implementing the constructor Push
in an implementation of Stk-Int could either signal overflow or return a stack constructed
by pushing the integer argument on the stack argument.

1. We are not considering parameterized implementations.

If a specification specifies nondeterministic operations, the requirement that an
implementation of a nondeterministic operation must have the maximum amount of
nondeterminism specified by the specification is too strong. (In case of the specification of
Set-Int given in Figure 3.1, such a requirement would mean that the procedure
implementing the operation Choose must be able to nondeterministically pick any element
of the set.) It is more appropriate to leave it to the designer of an implementation to decide
how much nondeterminism a procedure implementing a nondeterministic operation should
have: The procedure when viewed on 'abstract' values of the data type could be either
deterministic, returning a fixed result out of the many possible choices specified by the
specification for an input, or it could exhibit limited nondeterminism or the maximum amount
of nondeterminism specified by S, returning a subset of the set of possible results specified.
For example, a correct implementation with respect to the specification of Set-Int can have
the procedure implementing the operation Choose return the maximum integer in the set,
say, or it could have the procedure nondeterministically pick between the minimum and
maximum integers in the set, etc. As is discussed later, a deterministic procedure can also
simulate nondeterministic behavior on 'abstract' values by returning different values on
different values of the rep representing the same 'abstract' value of D. We call such a
procedure pseudo-nondeterministic.

5.1.1 Semantics of an Implementation

By a procedure, henceforth, we mean a procedure in an implementation I of D 
implementing an operation of D, unless stated otherwise; by a constructor procedure and 
an observer procedure, we mean a procedure implementing a constructor and a procedure 
implementing an observer, respectively. We use the name of an operation of D in S written 
in capital letters, as the name of the procedure implementing the operation in I. Outside I, 
we use an operation name instead of the name of the procedure implementing the 
operation to signify that the data type is being used abstractly. 

As data types are used abstractly in an implementation, the semantics of an
implementation I is a set of implementation algebras. These algebras can be constructed
hierarchically as in Chapter 2; we use in the construction, the implementation algebras
serving as the semantics of the implementations of the data types being used in I. Like a
type algebra, an implementation algebra has a domain corresponding to every defining
type $D' \in \Delta$, which is defined by an implementation algebra of an implementation I' of D'.

The domain corresponding to D is in general a subset of a domain corresponding
to the rep defined by an implementation algebra of an implementation $I_{rep}$ of the rep. It
consists of the values of the rep used to represent the values of D. The subset is
characterized by a formula Inv(r) with exactly one free variable r of the rep type. The
formula Inv(r) represents the strongest unary relation on the values of the rep preserved by
the constructor procedures in I. It captures the minimality property of the implementation,
namely that a value of the rep that represents a value of D can be constructed by finitely
many applications of the constructor procedures and that these values constitute the
smallest subset closed under the constructor procedures.
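
The minimality property can be illustrated by computing, for a small example, the smallest set of rep values closed under the constructor procedures. The Python sketch below rests on assumptions that are not taken from the text: sets of integers are represented by tuples, the constructors `insert` and `remove` are hypothetical procedures written for this illustration, the integers are restricted to a finite universe so that the closure is finite, and the candidate invariant `inv` ("no element occurs twice") is our guess at a suitable Inv(r) for such a representation.

```python
from itertools import permutations


def insert(seq, i):
    """Constructor: add i at the high end unless it is already present."""
    return seq if i in seq else seq + (i,)


def remove(seq, i):
    """Constructor: overwrite i with the last element, then drop the last."""
    if i not in seq:
        return seq
    j = seq.index(i)
    swapped = seq[:j] + (seq[-1],) + seq[j + 1:]
    return swapped[:-1]


def reachable(universe):
    """Smallest set of rep values containing () and closed under the constructors."""
    seen = {()}
    frontier = [()]
    while frontier:
        r = frontier.pop()
        for i in universe:
            for nxt in (insert(r, i), remove(r, i)):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen


def inv(r):
    """Candidate Inv(r): the sequence has no duplicate elements."""
    return len(set(r)) == len(r)


reps = reachable((1, 2, 3))
print(len(reps), all(inv(r) for r in reps))
```

Over the universe {1, 2, 3}, the closure consists of exactly the duplicate-free sequences, so in this small model the candidate invariant characterizes the reachable rep values; a sequence such as (1, 1) is a rep value that represents no set.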

Let $\mathcal{F}(I)$ stand for the semantics of I. This set can be defined inductively. We
assume that a set of primitive data types supported by the host programming language are
implemented correctly with respect to their specifications by its compiler. The semantics
of the specifications of such primitive types serves as the basis step of the inductive
definition. If one wishes to prove the correctness of the implementation of a primitive
type, the primitive type of the language in which the compiler is coded would then serve as
the basis.

In the inductive step, an implementation algebra $A$ in $\mathcal{F}(I)$ has the following
structure:

$$A = [\, \{V_D\} \cup \{V_{D'} \mid D' \in \Delta\},\ EXV;\ \{f_\sigma \mid \sigma \in \Omega\} \,].$$
$V_D = \{ v \mid v \in V_{rep} \wedge Inv(v) \}$, where $V_{rep}$ is defined by an implementation algebra in
$\mathcal{F}(I_{rep})$ for an implementation $I_{rep}$ of the rep. For each $D' \in \Delta$, $V_{D'}$ is defined by an algebra
in $\mathcal{F}(I_{D'})$ for an implementation $I_{D'}$ of D'. The specification of the procedure
implementing $\sigma$ is an abstract specification of $f_\sigma$.

In the next section, we discuss how to construct $\mathcal{F}(I)$ after the discussion about
the implementation structure and about Inv(r).






5.1.2 Correctness Method 

If we consider specifications not specifying any nondeterministic operations, then
the correctness criterion is simple: $\mathcal{F}(I) \subseteq \mathcal{F}(S)$. So, to prove the correctness of an
implementation I, we need to show that every implementation algebra in $\mathcal{F}(I)$ is also in
$\mathcal{F}(S)$, which can be done using the method discussed in Section 3.2 to show whether a type
algebra is in $\mathcal{F}(S)$. Two main steps of this method are:

(i) Construct the observable equivalence relation on $V_D$, as discussed in Sections 2.2 and
2.3, using the observable equivalence relation on $V_{D'}$ corresponding to each defining type
$D' \in \Delta$ and the observable equivalence relation on EXV, and

(ii) interpret the axioms and restrictions in the algebra, and show that they are satisfied.

Since the set of observable equivalence relations is a congruence, the observable
equivalence relations must be preserved by the procedures. The observable equivalence
relation is the largest such congruence on the algebra.

The above discussion is the formal basis of the correctness method proposed by
Guttag et al. [29] and Kapur [40]: The observable equivalence relation on the domain
corresponding to D is Guttag et al.'s equality interpretation. The above method in fact
extends the methods in [29] and [40] because it can handle procedures signalling exceptions
as well as nondeterministic procedures implementing deterministic operations.

Note that if there exists a correct implementation I of S, then S is consistent,
because then $\mathcal{F}(S)$ is not empty. This is the basis of Guttag and Horning's statement [28]
that one way of showing consistency of S is to design a correct implementation I of S.



2. A nondeterministic procedure can implement a deterministic operation if all possible results of the
procedure on every input are observably equivalent.






5.1.2.1 Nondeterminism 

For a specification S specifying nondeterministic operations, the criterion that
$\mathcal{F}(I) \subseteq \mathcal{F}(S)$ is too strong as it rules out implementations with pseudo-nondeterministic
procedures which ought to be correct. In such an implementation, a nondeterministic
operation is implemented either as a deterministic procedure or as a nondeterministic
procedure that does not preserve what should be the observable equivalence relation on the
values of the rep. It returns different values when applied on different rep values
representing the same 'abstract' value of D, but every value returned is a possible result
specified by S on the input; nondeterministic behavior of an operation is realized in this
way. If we take the largest equivalence relation on the rep values that is preserved by the
procedures as the interpretation of $\equiv$ in the implementation (which is so in case of
specifications not specifying nondeterministic operations), the axioms and restrictions in S
may not hold for such an implementation. However, if an equivalence relation preserved
only by the procedures implementing deterministic operations is taken as the observable
equivalence relation, then the axioms and restrictions in S hold.

Consider, for example, the implementation of Set-Int in a CLU-like language
given in Figure 5.1. The procedure CHOOSE is deterministic and returns the first element
of the sequence value used to represent the set argument. The largest equivalence relation
on the sequences preserved by all the procedures is the identity relation, and it can be
shown that the axioms of the specification of Set-Int do not hold if the identity relation is
taken as the observable equivalence relation. However, if we take the relation

    Eqv(s1, s2) = ( SI$Size(s1) ≡ SI$Size(s2) ) ∧ (∀ i) [ IN(s1, i) ⇔ IN(s2, i) ], where
    IN(s, i) = (∃ j) [ 1 ≤ j ≤ SI$Size(s) ∧ SI$Fetch(s, j) ≡ i ]
and SI stands for the data type Sequence of Integers, as the observable equivalence relation,
then the axioms hold. The procedure CHOOSE returns 1, for example, on the sequence
Addh(Addh(New, 1), 2) and 2 on Addh(Addh(New, 2), 1), so CHOOSE behaves differently
on members of the same equivalence class of sequences representing the same set value.
CHOOSE is an example of a pseudo-nondeterministic procedure.
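
The example can be replayed in Python. This is a rough model rather than a translation of Figure 5.1: sequences are modelled by tuples, only the procedures needed for the example are included, and the names `null`, `insert`, `has`, `size`, `choose`, `eqv` and the class `NoElement` are assumptions of the sketch.

```python
class NoElement(Exception):
    """Stands for the exception no-element."""


def null():
    return ()


def insert(seq, i):
    """INSERT: add i at the high end unless it is already in the sequence."""
    return seq if i in seq else seq + (i,)


def has(seq, i):
    return i in seq


def size(seq):
    return len(seq)


def choose(seq):
    """CHOOSE: deterministic on rep values, returns the bottom element."""
    if not seq:
        raise NoElement()
    return seq[0]


def eqv(a, b):
    """Eqv: same size and the same members, regardless of order."""
    return len(a) == len(b) and set(a) == set(b)


a = insert(insert(null(), 1), 2)   # represents the set {1, 2}
b = insert(insert(null(), 2), 1)   # a different rep value for the same set
print(a, b, eqv(a, b), choose(a), choose(b))
```

The two rep values are related by Eqv and agree on has and size, yet choose returns 1 on one and 2 on the other; both results are members of the set, as the axiom for Choose requires.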

To fufly illustrate the correctness method, we discuss two variations Of the 
implementation in Figure 5.1 differing in the implementations of Choose. In the first 



-183 



Figure 5.1. An Implementation of Set-Int 

SET-INT = cluster is NULL, INSERT, REMOVE, HAS, SIZE, CHOOSE 

rep = SEQUENCE-INT 

NULL = proc() returns (cvt) 
    return (rep$New()) 
end NULL 

INSERT = proc(s: cvt, i: Int) returns (cvt) 
    if INDEX(s, i) ≤ rep$Size(s) then return (s) end 
    return (rep$Addh(s, i)) 
end INSERT 

REMOVE = proc(s: cvt, i: Int) returns (cvt) 
    j: Int := INDEX(s, i) 
    if j ≤ rep$Size(s) then return (rep$Remh(rep$Replace(s, j, rep$Top(s)))) end 
    return (s) 
end REMOVE 

HAS = proc(s: cvt, i: Int) returns (Bool) 
    return (INDEX(s, i) ≤ rep$Size(s)) 
end HAS 

SIZE = proc(s: cvt) returns (Int) 
    return (rep$Size(s)) 
end SIZE 

CHOOSE = proc(s: cvt) returns (Int) signals (no-element) 
    if rep$Size(s) = 0 then signal no-element end 
    return (rep$Bottom(s)) 
end CHOOSE 

INDEX = proc(s: cvt, i: Int) returns (Int) 
    c: Int := 1 
    while c ≤ rep$Size(s) do 
        if rep$Fetch(s, c) = i then return (c) end 
        c := c + 1 
    end 
    return (c) 
end INDEX 



Choose is implemented as a deterministic procedure CHOOSE' which returns the 
maximum integer in the nonempty sequence; the procedure CHOOSE' is given in 






Figure 5.2. In the second, Choose is implemented as a nondeterministic procedure 
CHOOSE" which returns the maximum or minimum integer in the nonempty sequence. 
CHOOSE" is given in Figure 5.3. The construct Select in the code of CHOOSE" behaves 
nondeterministically: Select(S1, S2, ..., Sn), where each Si is a statement, arbitrarily picks one of 
the statements given as its arguments for execution. Note that neither CHOOSE' nor 
CHOOSE" is pseudo-nondeterministic. 
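
The distinction can be sketched in Python by modeling each procedure as the set of results it may return on a given representation. This is only an illustration under assumed names: tuples stand in for sequences, and choose_first, choose_max, and choose_max_or_min are hypothetical stand-ins for CHOOSE, CHOOSE', and CHOOSE".

```python
import itertools

def choose_first(s):
    """Stand-in for CHOOSE: the only possible result is the first element."""
    return {s[0]}

def choose_max(s):
    """Stand-in for CHOOSE': the only possible result is the maximum."""
    return {max(s)}

def choose_max_or_min(s):
    """Stand-in for CHOOSE'': Select may run either branch, so two results are possible."""
    return {max(s), min(s)}

def same_results_on_all_orderings(proc, elems):
    """True if proc offers the same possible results on every ordering of elems."""
    results = [proc(p) for p in itertools.permutations(elems)]
    return all(r == results[0] for r in results)

for proc in (choose_first, choose_max, choose_max_or_min):
    print(proc.__name__, same_results_on_all_orderings(proc, (3, 1, 2)))
```

In this model only choose_first depends on the ordering of the representation; the other two offer the same possible results on every permutation, which is why they are not pseudo-nondeterministic.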



Figure 5.2. CHOOSE' 

CHOOSE' = proc(s: cvt) returns (Int) signals (no-element) 
    if rep$Size(s) = 0 then signal no-element end 
    return (MAX(s)) 
end CHOOSE' 

MAX = proc(s: rep) returns (Int) 
    m := rep$Bottom(s) 
    for i := 2 to rep$Size(s) do 
        if m < rep$Fetch(s, i) then m := rep$Fetch(s, i) end 
    end 
    return (m) 
end MAX 



Figure 5.3. CHOOSE" 

CHOOSE" = proc(s: cvt) returns (Int) signals (no-element) 
    if rep$Size(s) = 0 then signal no-element end 
    Select(return (MAX(s)), return (MIN(s))) 
end CHOOSE" 

MIN = proc(s: rep) returns (Int) 
    m := rep$Bottom(s) 
    for i := 2 to rep$Size(s) do 
        if m > rep$Fetch(s, i) then m := rep$Fetch(s, i) end 
    end 
    return (m) 
end MIN 






5.1.2.2 Definition of Correctness 

We can now state the correctness criterion. It has two parts. The first part deals 
with implementations not having pseudo-nondeterministic procedures, and the second part 
takes care of pseudo-nondeterministic procedures. In the second part, the equivalence 
relation used on the rep is not required to be preserved by the procedures implementing 
nondeterministic operations, thus allowing them to be pseudo-nondeterministic; the 
equivalence relation is only required to be preserved by the procedures implementing 
deterministic operations. 

Def. 5.1 An implementation I is correct with respect to a specification S if and only if, 
assuming that every data type D' used in I has a correct implementation I' with respect to its 
specification S', 

(i) F(I) ⊆ F(S), or 

(ii) for every algebra A ∈ F(I), there is a set of equivalence relations, 
E = { E_D' | D' ∈ Δ ∪ {D} } ∪ E_EXV, such that 

(a) for every defining type D' ∈ Δ, E_D' is the equivalence relation on V_D' used to prove 
correctness of the implementation I_D' of D', and similarly, E_rep is the equivalence relation 
on V_rep used to prove correctness of an implementation I_rep of the rep, 

(b) E_EXV is the equivalence relation defined as follows: For an exception name ex of 
arity D_1 × ... × D_n, if <v_1, v_1'> ∈ E_D1, ..., <v_n, v_n'> ∈ E_Dn, then <ex(v_1, ..., v_n), ex(v_1', ..., v_n')> ∈ E_EXV, 

(d) E is preserved by the functions corresponding to deterministic operations in A, and 
(e) A/E ∈ F(S). 

A/E is the quotient algebra of A induced by E except that E need not be a congruence; the 
function in A/E corresponding to a function f in A that does not preserve E behaves 
nondeterministically. The formal characterization above is complex because an 
implementation of a defining type or the rep could also have pseudo-nondeterministic 
procedures. 

In the correctness method, we do not explicitly construct the set F(I) of 
implementation algebras defined by I. We reason about the set as a whole by not using any 
property specific to any particular implementation of D' ∈ Δ or of the rep, and by instead 
using the procedure specifications and the theories of the defining types and the rep. We 
show that the axioms and restrictions of S hold when interpreted in I by deriving them 
from the procedure specifications. 

Roughly speaking, the following steps need to be carried out to show correctness 
of an implementation: 

(i) Derive the specification of every procedure in the implementation as a function on 
rep values from its code. 

(ii) Design a formula Inv(r) characterizing the subset of the rep values needed to 
represent the values of D. It must express the strongest unary relation preserved by the 
constructor procedures. 

(iii) Design the equivalence relation on the values of the rep satisfying Inv. The 
equivalence relation must be preserved by the procedures implementing the deterministic 
operations. 

(iv) Interpret the restrictions and axioms using the procedures in place of the operations. 
Replace a variable of type D by a variable of the rep type satisfying Inv. Interpret ≡ 
corresponding to D as the equivalence relation of step (iii). 

We discuss each of these steps in detail in the next two sections: The second section 
discusses the first two steps; the remaining steps and the correctness method are illustrated 
in the third section. We argue that a formula weaker than Inv often suffices; furthermore, 
the equivalence relation needed in step (iv) is also often weaker than the strongest 
equivalence relation preserved by the procedures implementing the deterministic 
operations. We also discuss what extra steps need to be performed if auxiliary functions 
are used in a specification. 

For recursive and mutually recursive implementations, there is an additional step 
in the correctness proof. We need to show that the rep (reps in case of mutually recursive 
implementations) defined by the recursive domain equation(s) is nonempty. The rest of the 
proof is the same as in the case of nonrecursive implementations. 






5.2 Implementation Structure and Semantics 

Besides the procedures implementing the operations of D, an implementation I of 
D may include helping procedures needed in writing the procedures implementing the 
operations. For example, INDEX is a helping procedure in the implementation of Set-Int 
given in Figure 5.1. A helping procedure is not available outside the implementation, so 
we call it an internal procedure of I. Let I_P stand for the set of all internal procedures used 
in I. The procedures in I may also use types other than the rep and the defining types of D, 
if need be; we call such types internal types of I and denote the set of internal types in I as 
I_T. Note that the internal procedures and internal types of an implementation I are 
different from the auxiliary functions and auxiliary types used in its specification S. 

In this thesis, we do not wish to be concerned about the semantics of the control 
structures used in coding the procedures. There are at least two approaches to avoid 
considering the control structures, which are discussed below. However, we illustrate the 
correctness method using only the translational approach. We have worked the correctness 
proofs using the other approach; the proofs in that case are similar in flavor to the proofs 
using the translational approach. These proofs are not presented in the thesis. We believe 
that the correctness method would work using any approach for specifying the procedures. 

Most programming languages supporting user defined data types provide a 
mechanism that encapsulates a collection of procedures implementing the operations of a 
data type and provides an abstract view of data outside the mechanism, for example, 
cluster in CLU, form in ALPHARD, etc. The encapsulation mechanism constrains the use 
of the procedures. We discuss below the properties desired of an encapsulation mechanism 
that facilitate the correctness proof of an implementation. Finally, we discuss how we get 
the semantics of an implementation I as a set F(I) of implementation algebras to complete 
the formal aspects of the correctness method. 






5.2.1 Procedures - Approach I 

In Chapter 4, we discussed a method based on the Floyd-Hoare approach for 
specifying a procedure. In this method, a procedure is specified as a set of formulas 
relating its input to the result(s) returned by it. The procedures implementing the 
operations in an implementation I can be specified in this way; the specifications of 
internal procedures are not included if they are not referred to in the specifications of the 
procedures implementing the operations. A procedure is specified as a transformation on 
the values of the rep. To verify the correctness of a procedure with respect to its 
specification, the theories of the defining types, the rep, and the internal types are used. 

Figure 5.4 is the specification of the procedures in the implementation of Set-Int 
given in Figure 5.1 using this method. It also has specifications of CHOOSE' and 
CHOOSE". Instead of using the procedure invocation itself to stand for the result (or a 
possible result in case of a nondeterministic procedure), we have introduced, for 
convenience, a name for the result. For example, the specification of the procedure 
REMOVE uses r to stand for the result of REMOVE on inputs s and i. The specification 
captures that 

(i) if the integer argument i is in the sequence argument s, then r is the sequence obtained 
by first replacing the first occurrence of i in s by the topmost element in the sequence and 
then getting rid of the topmost element; otherwise, 

(ii) r is s itself. In deriving these specifications, we have used the specification of the data 
type Sequence-Int given in Appendix IV. 

5.2.2 Procedures - Approach II 

We translate a procedure implemented in a rich imperative programming 
language to a simple applicative language similar to the specification language proposed in 
Chapter 3 using the method suggested by McCarthy [56] (see [54] where the method is well 
explained). The translated procedures are then used to prove the correctness of the implementation I. 
Guttag et al. [29] and Kapur [40] take this approach; they use a language supporting 
conditional expressions, composition, recursion, and the use of auxiliary functions. 






Figure 5.4. Specification of the Procedures in the Implementation of Set-Int Using Approach I 

NULL(): (= r) 
    r = rep$New() 

INSERT(s, i): (= r) 
    ( In'(s, i) ⇒ r = s ) ∧ ( ~ In'(s, i) ⇒ r = rep$Addh(s, i) ) 

REMOVE(s, i): (= r) 
    (∃ j) [ i ≡ s[j] ∧ (∀ j') [ j' < j ⇒ ~ i ≡ s[j'] ] ∧ 
            r = rep$Remh(rep$Replace(s, j, rep$Top(s))) ] ∨ (~ In'(s, i) ⇒ r = s) 

HAS(s, i): (= b) 
    (b = T) ≡ In'(s, i) 

SIZE(s): (= i) 
    i = rep$Size(s) 

CHOOSE(s): (= i) 
    rep$Size(s) = 0 ⇒ CHOOSE(s) signals no-element() 
    rep$Size(s) > 0 ⇒ i = s[1] 

CHOOSE'(s): (= i) 
    rep$Size(s) = 0 ⇒ CHOOSE'(s) signals no-element() 
    rep$Size(s) > 0 ⇒ ( In(s, i) ∧ (∀ j) [ 1 ≤ j ≤ rep$Size(s) ⇒ s[j] ≤ i ] ) 

CHOOSE"(s): (= i) 
    rep$Size(s) = 0 ⇒ CHOOSE"(s) signals no-element() 
    rep$Size(s) > 0 ⇒ ( In(s, i) ∧ ( (∀ j) [ 1 ≤ j ≤ rep$Size(s) ⇒ s[j] ≤ i ] 
                                    ∨ (∀ j) [ 1 ≤ j ≤ rep$Size(s) ⇒ i ≤ s[j] ] ) ), 

where In(s, i)  = (∃ j) [ 1 ≤ j ≤ rep$Size(s) ∧ s[j] ≡ i ] 
      In'(s, i) = (∃ j) [ 1 ≤ j ≤ rep$Size(s) ∧ s[j] ≡ i ∧ (∀ j') [ j' < j ⇒ ~ i ≡ s[j'] ] ] 



We use an extended applicative language that has a signal primitive and guarded 
expressions in addition to composition and recursion mechanisms, and the use of auxiliary 
functions, so that the procedures signalling exceptions and exhibiting nondeterministic 
behavior can be specified. Conditional expressions can be simulated using guarded 
expressions. The translation method proposed by McCarthy can be extended to deal with 
the exception handling mechanism and the nondeterministic construct in a programming 
language. 

An expression is similar to a term; it uses procedure names implementing the 
operations, internal procedure names, the auxiliary procedure names introduced during the 
translation, and terms. 

The signal primitive takes arbitrarily many (nonzero) arguments; its first 
argument is an exception name, and other arguments are expressions of various types. Its 
syntax is signal(ex, e_1, ..., e_n), where ex is an exception name with arity D_1 × ... × D_n 
and each e_i is an expression of type D_i. 

A guarded expression is similar to Dijkstra's guarded commands; its syntax is 

<guarded expression> ::= <expression> | <alternative> [ || <alternative> ] 
<alternative> ::= <condition> ⇒ <guarded expression> 
<condition> ::= <boolean expression> 

where [ X ] stands for zero or finitely many repetitions, and the symbol '||' stands for 
nondeterministic choice among various alternatives. If a guarded expression is simply an 
expression, then its semantics is that of an expression. Otherwise, if a guarded expression is 
a collection of alternatives, then for an instance of its variables, its semantics is the 
semantics of the guarded expression of an arbitrarily chosen alternative whose boolean 
condition is T. If every alternative has its condition as F, then the semantics of the guarded 
expression is undefined. A guarded expression exhibits nondeterministic behavior because 
for an instance of the variables, there are in general many alternatives whose condition is T, 
and one such alternative is arbitrarily chosen. 
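
The semantics just described can be sketched in Python by computing the set of values a guarded expression may yield. This is a simplified illustration, not the thesis's formal semantics: alternatives are assumed to be flat (condition, body) pairs, signals are left out, and the names possible_values and choose2 are hypothetical.

```python
def possible_values(alternatives, env):
    """Return the set of values a guarded expression may yield in env.

    alternatives is a list of (condition, body) pairs, each a function of env.
    Only bodies whose condition is true are evaluated. An empty result models
    the undefined case in which every condition is F.
    """
    return {body(env) for cond, body in alternatives if cond(env)}

# A guarded expression shaped like the nonempty case of CHOOSE'':
#   (size > 0) => maximum  ||  (size > 0) => minimum
choose2 = [
    (lambda e: len(e["s"]) > 0, lambda e: max(e["s"])),
    (lambda e: len(e["s"]) > 0, lambda e: min(e["s"])),
]

print(possible_values(choose2, {"s": (4, 7, 5)}))  # both alternatives are enabled
print(possible_values(choose2, {"s": ()}))         # no alternative is enabled
```

Returning a set rather than picking one alternative at random keeps the sketch deterministic while still recording every outcome that an arbitrary choice could produce.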

We translate the procedures in the implementation of Set-Int in Figure 5.1 to the 
above applicative language. Figure 5.5 is their translation; we have also included the 
translation of the procedures CHOOSE' and CHOOSE" as well as of the internal 
procedures MAX and MIN. In translating the internal procedure INDEX, the auxiliary 
function f is introduced to simulate the effect of the while loop used in INDEX. Similarly, 



3. An alternate approach to introducing guarded expressions for specifying the nondeterministic behavior of 
a procedure OP is to specify its non-exceptional behavior using a deterministic boolean auxiliary function 
OP_P, similar to the function σ_P corresponding to a nondeterministic operation σ as discussed in the 
previous chapter. For an input on which the nondeterministic procedure returns a normal value, the 
corresponding auxiliary function holds for all possible values returned by the procedure on that input and 
does not hold for other values. Then the procedures can be specified using conditional expressions and 
recursion. We have adopted the above approach for specifying the procedures, because it is direct and 
simple. 






the auxiliary procedures f' and f" are introduced to simulate the for loop in MAX and MIN 
respectively. 
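
The idea of replacing a loop by a recursive auxiliary function can be sketched in Python for INDEX. The sketch assumes tuples as sequences with 1-based positions, as in Figure 5.1; index_loop and index_rec are illustrative names, while f mirrors the auxiliary function of the same name.

```python
def index_loop(s, i):
    """INDEX written with a while loop: position of the first i, or size + 1."""
    c = 1
    while c <= len(s):
        if s[c - 1] == i:      # s[c - 1] plays the role of rep$Fetch(s, c)
            return c
        c = c + 1
    return c

def f(s, i, c):
    """Recursive auxiliary function standing in for the loop; c is the current position."""
    if not (c <= len(s)):
        return c               # ran off the end: i does not occur in s
    if s[c - 1] == i:
        return c               # found the first occurrence of i
    return f(s, i, c + 1)      # otherwise continue with the next position

def index_rec(s, i):
    """INDEX in applicative form: start the recursion at position 1."""
    return f(s, i, 1)

print(index_loop((5, 3, 9), 3), index_rec((5, 3, 9), 3))
print(index_loop((5, 3, 9), 7), index_rec((5, 3, 9), 7))
```

The loop variable becomes an extra parameter of f, and each loop exit becomes one base case of the recursion.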

Cartwright and McCarthy's first order semantics of recursive programs [8] can be 
used to prove properties about the procedures written in the above applicative language. 
The recursive definition of a procedure is considered as an axiom defining the function 
computed by the procedure. Because of the nondeterministic behavior of a guarded 
expression, we have to be careful in using such an axiom, or we will run into 
inconsistencies. For a particular instantiation of variables in the axiom, we use every 
possible alternative whose condition is T, and we do not relate any two alternatives whose 
conditions are T. For example, for CHOOSE", there are two alternatives, MAX(s) and 
MIN(s), for the case (~ rep$Size(s) = 0). We do not equate MAX(s) to MIN(s), as relating 
them can cause inconsistency. The termination of a procedure is proved separately either 
using the method suggested by Cartwright and McCarthy, or the method based on well 
founded ordering [14]. 

The translational approach is purely based on the semantics of the control 
structures of the host programming language in terms of the primitives of the applicative 
language incorporated into the translation method. The properties of the types involved in 
the implementation can be used in simplifying the resulting translations. 

5.2.3 Properties of the Encapsulation Mechanism 

As was stated earlier, in most of the programming languages supporting user 
defined data types, an implementation of a data type is an encapsulation of the procedures 
implementing the operations that disciplines their use. Such an implementation is 
protected: A procedure implementing an operation of D cannot be passed any arbitrary 
value of the rep as a representation of a value of D; instead, only a value of the rep 
constructed earlier as a representation for a value of D by the constructor procedures of D 
can be passed. Every value of the rep need not, in general, be used to represent a value of D. 
The procedures are invoked only on those values of the rep which can be constructed by 
finitely many applications of the constructor procedures of D. (For example, the procedure 
REMOVE in the implementation of Set-Int in Figure 5.1 is never passed a sequence having 






Figure 5.5. Translation of the Procedures in the Implementation of Set-Int 

NULL ≜ rep$New() 

INSERT(s, i) ≜ INDEX(s, i) ≤ rep$Size(s) ⇒ s || 
               (~ INDEX(s, i) ≤ rep$Size(s)) ⇒ rep$Addh(s, i) 

REMOVE(s, i) ≜ INDEX(s, i) ≤ rep$Size(s) ⇒ 
                   rep$Remh(rep$Replace(s, INDEX(s, i), rep$Top(s))) || 
               (~ INDEX(s, i) ≤ rep$Size(s)) ⇒ s 

HAS(s, i) ≜ INDEX(s, i) ≤ rep$Size(s) 

SIZE(s) ≜ rep$Size(s) 

CHOOSE(s) ≜ rep$Size(s) = 0 ⇒ signal(no-element) || 
            (~ rep$Size(s) = 0) ⇒ rep$Bottom(s) 

INDEX(s, i) ≜ f(s, i, 1) 

CHOOSE'(s) ≜ rep$Size(s) = 0 ⇒ signal(no-element) || 
             (~ rep$Size(s) = 0) ⇒ MAX(s) 

MAX(s) ≜ f'(s, rep$Bottom(s), 2) 

CHOOSE"(s) ≜ rep$Size(s) = 0 ⇒ signal(no-element) || 
             (~ rep$Size(s) = 0) ⇒ MAX(s) || 
             (~ rep$Size(s) = 0) ⇒ MIN(s) 

MIN(s) ≜ f"(s, rep$Bottom(s), 2) 

Auxiliary Functions 

f  : rep × Int × Int → Int 
f' : rep × Int × Int → Int 
f" : rep × Int × Int → Int 

f(s, i, c) ≜ (~ c ≤ rep$Size(s)) ⇒ c || 
             ((c ≤ rep$Size(s)) ∧ (rep$Fetch(s, c) = i)) ⇒ c || 
             ((c ≤ rep$Size(s)) ∧ ~ (rep$Fetch(s, c) = i)) ⇒ f(s, i, c + 1) 

f'(s, m, c) ≜ (~ c ≤ rep$Size(s)) ⇒ m || 
              ((c ≤ rep$Size(s)) ∧ (m < rep$Fetch(s, c))) ⇒ f'(s, rep$Fetch(s, c), c + 1) || 
              ((c ≤ rep$Size(s)) ∧ (~ m < rep$Fetch(s, c))) ⇒ f'(s, m, c + 1) 

f"(s, m, c) ≜ (~ c ≤ rep$Size(s)) ⇒ m || 
              ((c ≤ rep$Size(s)) ∧ (m > rep$Fetch(s, c))) ⇒ f"(s, rep$Fetch(s, c), c + 1) || 
              ((c ≤ rep$Size(s)) ∧ (~ m > rep$Fetch(s, c))) ⇒ f"(s, m, c + 1) 






multiple occurrences of an integer, as such a sequence cannot be constructed using NULL, 
INSERT and REMOVE.) We are interested in the behavior of the procedures only on this 
subset of the values of the rep. The subset is characterized by the formula Inv(r) discussed 
in the previous section, which expresses the strongest unary relation on the values of the 
rep preserved by the constructor procedures of D. Inv(r) is expressed without alluding to 
any particular implementation of the rep type. 

Def. 5.2 A procedure OP implementing a constructor σ : D_1 × ... × D_n → D preserves Inv 
if and only if 
whenever (∀ 1 ≤ i ≤ n) [ D_i = D ⇒ Inv(x_i) ], then 
(i) if OP(x_1, ..., x_n) returns a normal value, Inv(OP(x_1, ..., x_n)); otherwise, 
(ii) if OP(x_1, ..., x_n) signals ex(e_1, ..., e_m), then for each e_j of type D, Inv(e_j). 

If OP is nondeterministic, all possible results returned by OP must satisfy Inv. 

For the implementation of Set-Int given in Figure 5.1, Inv(s) is 
(∀ i, j) [ (1 ≤ i, j ≤ rep$Size(s) ∧ i ≠ j) ⇒ s[i] ≠ s[j] ], 
where s[i] is an abbreviation for rep$Fetch(s, i). It can be verified that Inv(s) is preserved by 
the constructor procedures of Set-Int. Figure 5.6 is a proof that REMOVE preserves Inv(s), 
the most difficult among the three cases. Any predicate stronger than the one above is not 
preserved by the constructor procedures. 
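
As an informal complement to the proof, preservation of Inv can be checked by brute force in Python on a small universe of integers. The sketch assumes tuples as sequences; index, insert, remove, and inv follow Figure 5.1 and the formula above, and reachable is a hypothetical helper that enumerates rep values built from NULL by a bounded number of constructor calls. Such a check is evidence only, not a proof.

```python
def index(s, i):
    """Stand-in for INDEX: 1-based position of the first i in s, else len(s) + 1."""
    for c, x in enumerate(s, start=1):
        if x == i:
            return c
    return len(s) + 1

def insert(s, i):
    """Stand-in for INSERT: add i at the high end unless it is already present."""
    return s if index(s, i) <= len(s) else s + (i,)

def remove(s, i):
    """Stand-in for REMOVE: overwrite the first i with the top element, then drop the top."""
    j = index(s, i)
    if j <= len(s):
        replaced = s[:j - 1] + (s[-1],) + s[j:]   # Replace(s, j, Top(s))
        return replaced[:-1]                      # Remh
    return s

def inv(s):
    """Inv(s): no integer occurs twice in s."""
    return len(set(s)) == len(s)

def reachable(universe, depth):
    """Rep values obtainable from the empty sequence by at most depth INSERT/REMOVE calls."""
    seen = {()}
    frontier = {()}
    for _ in range(depth):
        produced = set()
        for s in frontier:
            for i in universe:
                produced.add(insert(s, i))
                produced.add(remove(s, i))
        frontier = produced - seen
        seen |= frontier
    return seen

print(all(inv(s) for s in reachable((1, 2, 3), 4)))
```

On a value that violates Inv, such as (1, 2, 1), remove(s, 1) yields (1, 2), which still contains 1; this suggests why the procedures are only required to behave properly on values satisfying Inv.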

Inv may be difficult to deduce from a complex implementation, but the designer 
of an implementation usually has a good idea about what Inv is. In the correctness proof, 
Inv is usually not necessary; a weaker property may suffice. In case Inv is available, a 
property of the representing values needed in the correctness proof can be deduced directly 
from Inv. Otherwise, if Inv is not available, then the property can be deduced by checking 
whether the property is preserved by the constructor procedures; since Inv is the strongest 
unary relation preserved by the constructor procedures, any unary relation preserved by 
the constructor procedures is implied by Inv. 

If a module implementing an abstract data type in a programming language is not 
protected, as would be the case if abstract data types are simulated in PASCAL or PL/I, 
say, then 






Figure 5.6. Proof of REMOVE Preserving Inv 

Assume Inv(s) holds. To show that Inv(REMOVE(s, i)) holds. 

If the type name is not included in the operation names below, we assume that the operations are of type rep. 

There are two cases. 

Case 1: INDEX(s, i) ≤ Size(s) 

Size(s) > 0 ⇔ T, from the specification of INDEX 

Inv(REMOVE(s, i)) ≡ Inv(Remh(Replace(s, INDEX(s, i), Top(s)))), from the specification of REMOVE 
It can be shown using the specification of INDEX and the theory of Sequence-Int that 

(i) ( Inv(s) ∧ 0 < k ≤ Size(s) ∧ s' ≡ Replace(s, k, j) ) ⇒ 
    ( (∀ k1) [ (1 ≤ k1 ≤ Size(s) ∧ ~ k ≡ k1) ⇒ s'[k1] ≡ s[k1] ] ∧ s'[k] ≡ j ) 

(ii) ( Size(s) > 0 ∧ s' ≡ Remh(s) ) ⇒ (∀ k) [ (1 ≤ k ≤ Size(s')) ⇒ (s'[k] ≡ s[k] ∧ Size(s') ≡ Size(s) - 1) ] 

Using (i) and (ii), we have Inv(REMOVE(s, i)) ≡ T 

Case 2: ~ INDEX(s, i) ≤ Size(s) 
Inv(REMOVE(s, i)) ≡ Inv(s), from the specification of REMOVE 
                  ≡ T 



(i) restrictions must be imposed on the global variables, if any, as well as on the use of the 
procedures implementing the operations to ensure the minimality property of the 
implementation, and 

(ii) Inv must be preserved wherever a procedure implementing an operation is invoked. 
Such a proof is likely to be global and complex. (Guttag [31] discusses restrictions on the 
Euclid implementation module to ensure that the module satisfies the minimality property.) 
In the following discussion, we assume that the semantics of a mechanism encapsulating 
the procedures implementing the operations of a data type ensures the minimality 
property. 

It is not necessary for the procedures to terminate over their entire input domain 
if Inv(r) is other than T. To prove total correctness of an implementation, it is sufficient 
that a procedure implementing an operation σ whose i-th argument x_i is of type D 
terminates whenever Inv(x_i) holds. 






5.2.4 Semantics of an Implementation 

Now that we have the procedure specifications, we can construct the 
implementation algebras of I using them. Since procedure specifications may use internal 
types and internal and auxiliary procedures, we first construct the extended 
implementation algebras and then derive the implementation algebras from them. For 
every possible implementation I' of a type D' used in the implementation I, we have the set 
of its implementation algebras. In an implementation algebra of I, the domain 
corresponding to D' is the domain defined by an implementation algebra of I'. An 
extended implementation algebra A^I of I has the following structure: 

A^I = [ {V_D^I} ∪ {V_D' | D' ∈ Δ ∪ I_T}, EXV; {f_σ | σ ∈ Ω ∪ I_P} ], 4 

where I_T and I_P are the sets of internal types and internal procedures of I, and 
V_D^I = { v | v ∈ V_rep ∧ Inv(v) }. The function f_σ is the interpretation of the specification of 
the procedure corresponding to σ in A^I. From A^I, we get an implementation algebra A: 

A = [ {V_D^I} ∪ {V_D' | D' ∈ Δ}, EXV; {f_σ | σ ∈ Ω} ] 



4. In addition to the internal procedures, I_P is assumed to include the auxiliary procedures needed in the 
translation of the procedures into the applicative language discussed above. 






5.3 Correctness Method 

We describe the remaining steps of the correctness method outlined in 
Subsection 5.1.2. For completeness, we repeat the steps discussed in the previous section 
about the termination of the procedures and the preservation of the formula Inv. For a 
specification specifying nondeterministic operations, we discuss the method for three cases: 
An implementation of a nondeterministic operation is (i) a deterministic procedure, (ii) a 
nondeterministic procedure, and (iii) a pseudo-nondeterministic procedure. We first use 
the implementation of Set-Int given in Figure 5.1 with CHOOSE replaced by CHOOSE' 
for illustrating the method for the deterministic case. Later, we use CHOOSE" as the 
implementation of Choose to illustrate the method for the nondeterministic case, and 
finally, we use CHOOSE to illustrate the method for the pseudo-nondeterministic case. 

5.3.1 Auxiliary Functions in a Specification 

If a specification S uses auxiliary functions and auxiliary types, we extend an 
implementation I to include the implementations of the auxiliary functions in the 
correctness proof. We include in the specifications of the procedures of I the specifications 
of the implementations of the auxiliary functions. For showing the correctness of I, we use 
the extended implementation instead of I in the following steps; an auxiliary function is 
treated like an operation. In the following discussion, whenever we say I, we mean the 
extended implementation if S uses auxiliary functions. 

5.3.2 Preservation of Inv 

If the formula Inv(r), which characterizes the subset of values of the rep used to 
represent the values of D, is available, verify that Inv(r) is preserved by every constructor 
procedure. We showed in the previous section that for the implementation of Set-Int in 
Figure 5.1, its Inv is preserved by every constructor procedure. 

If Inv(r) is not available and cannot be guessed easily, we temporarily assume that 
every value of the rep is being used to represent the values of D. In the derivation of the 
axioms and restrictions of S from the procedure specifications, in case we need any property 
P(r) of the rep values, we deduce P(r) by showing that P(r) is preserved by the constructor 
procedures of D, as in that case Inv(r) would imply P(r). 

In the derivation of an axiom or a restriction in S from the procedure 
specifications, a variable of type D is instantiated to a value of the rep satisfying Inv(r) (or 
P(r) if Inv(r) is not available). 

5.3.3 Termination of Procedures 

Prove that every procedure in I is total on the arguments it can expect, i.e., for the 
arguments of a procedure that are of type D, prove that the procedure terminates whenever 
these arguments are values of the rep satisfying Inv(r). 

5.3.4 Proving Restrictions and Axioms 

Show that every restriction in S specifying the exceptional behavior and every 
axiom in S specifying the normal behavior can be derived from the specifications of 
procedures in I. The operation symbols and the auxiliary function symbols in the axioms 
and restrictions are replaced by the names of procedures implementing them. The theories 
derived from the specifications of the defining types, the rep, and internal types can be 
used in the derivations. 

The symbol ≡ in S is interpreted as the observable equivalence relation. ≡_D is 
usually interpreted as the largest equivalence relation on the values of the rep satisfying Inv, 
preserved by the procedures. The exception is the case when a nondeterministic operation 
is implemented as a pseudo-nondeterministic procedure. In that case, the equivalence 
relation serving as the interpretation of ≡_D is required to be preserved only by the 
procedures implementing deterministic operations, and it need not be the largest such 
equivalence relation. 






5.3.4.1 Preservation of Equivalence Relation 

A deterministic procedure OP implementing an operation σ : D_1 × ... × D_n → D' 
preserves an equivalence relation on the rep values, expressed as a first order formula 
Eqv(s_1, s_2), where s_1 and s_2 are of rep type, and are the only free variables in the formula, if 
and only if, whenever for each 1 ≤ i ≤ n, ( [ D_i = D ⇒ Eqv(x_i, y_i) ] ∧ [ D_i ≠ D ⇒ x_i ≡ y_i ] ), either 

(i) 'OP(x_1, ..., x_n) signals ext_1' holds and 'OP(y_1, ..., y_n) signals ext_2' holds such that 
'ext_1 ≡ ext_2' is provable. In addition to the rules discussed in the previous chapter, we 
have: For an exception name ex of arity D'_1 × ... × D'_m, if for every D'_j = D, Eqv(x'_j, y'_j), and 
for every D'_j ≠ D, x'_j ≡ y'_j, then ex(x'_1, ..., x'_m) ≡ ex(y'_1, ..., y'_m) is provable. Or, 

(ii) If D' = D, then 'Eqv(OP(x_1, ..., x_n), OP(y_1, ..., y_n))' is provable, and if D' ≠ D, then 
'OP(x_1, ..., x_n) ≡_D' OP(y_1, ..., y_n)' is provable. 

If OP is nondeterministic, then (ii) above is modified to be: If D' = D, then for every 
possible result r_1 returned by OP(x_1, ..., x_n), OP(y_1, ..., y_n) can return r_2 such that Eqv(r_1, r_2) is 
provable, and vice versa, and if D' ≠ D, for every r_1 returned by OP(x_1, ..., x_n), OP(y_1, ..., y_n) 
can return r_2 such that 'r_1 ≡_D' r_2' is provable, and vice versa. 

For example, Eqv(s1, s2) for the implementation of Set-Int in Figure 5.1 with 
CHOOSE' replacing CHOOSE is 
(SI$Size(s1) ≡ SI$Size(s2)) ∧ (∀ i) [ IN(s1, i) ≡ IN(s2, i) ], where 
IN(s, i) = (∃ j) [ 1 ≤ j ≤ SI$Size(s) ∧ s[j] ≡ i ]. 
It relates sequences that are permutations of each other. Eqv is preserved by every 
procedure implementing an operation of Set-Int. Figure 5.7 has the proofs for the 
procedures INSERT and HAS. Eqv(s_1, s_2) is the largest equivalence relation preserved by 
the procedures. Any equivalence relation stronger than Eqv would have to relate sequences 
that are not permutations, and is thus not preserved by HAS. 
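
Preservation of Eqv by INSERT and HAS can also be checked exhaustively on a small universe with a Python sketch. It assumes tuples as sequences restricted to values satisfying Inv (no duplicates); index, insert, has, and eqv follow Figure 5.1 and the formula above, and the two helper functions are hypothetical. The check is illustrative and does not replace the proofs in Figure 5.7.

```python
import itertools

def index(s, i):
    """Stand-in for INDEX: 1-based position of the first i in s, else len(s) + 1."""
    for c, x in enumerate(s, start=1):
        if x == i:
            return c
    return len(s) + 1

def insert(s, i):
    """Stand-in for INSERT."""
    return s if index(s, i) <= len(s) else s + (i,)

def has(s, i):
    """Stand-in for HAS."""
    return index(s, i) <= len(s)

def eqv(s1, s2):
    """Eqv: equal sizes and the same members."""
    return len(s1) == len(s2) and set(s1) == set(s2)

def duplicate_free_sequences(universe):
    """All sequences over universe satisfying Inv (every ordering of every subset)."""
    for k in range(len(universe) + 1):
        yield from itertools.permutations(universe, k)

def insert_and_has_preserve_eqv(universe, probes):
    """Check the preservation condition for every Eqv-related pair and every probe value."""
    seqs = list(duplicate_free_sequences(universe))
    for s1, s2 in itertools.product(seqs, repeat=2):
        if not eqv(s1, s2):
            continue
        for i in probes:
            if has(s1, i) != has(s2, i):
                return False
            if not eqv(insert(s1, i), insert(s2, i)):
                return False
    return True

print(insert_and_has_preserve_eqv((1, 2, 3), (1, 2, 3, 4)))
```

A relation that also identified (1,) with (2,), for instance one comparing sizes only, would fail this check because has((1,), 1) and has((2,), 1) differ.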






Figure 5.7. Proofs that INSERT and HAS Preserve Eqv 

For INSERT 

Assume Eqv(s1, s2); to show that (∀ i) Eqv(INSERT(s1, i), INSERT(s2, i)). 

Case 1: INDEX(s1, i) ≤ SI$Size(s1) ≡ T 
Using Eqv(s1, s2), we have INDEX(s2, i) ≤ SI$Size(s2) ≡ T, so 
INSERT(s1, i) ≡ s1, INSERT(s2, i) ≡ s2, so Eqv(INSERT(s1, i), INSERT(s2, i)) ≡ T 

Case 2: INDEX(s1, i) ≤ SI$Size(s1) ≡ F 
Using Eqv(s1, s2), we have INDEX(s2, i) ≤ SI$Size(s2) ≡ F, so 
INSERT(s1, i) ≡ Addh(s1, i), INSERT(s2, i) ≡ Addh(s2, i), so 
Eqv(INSERT(s1, i), INSERT(s2, i)) ≡ Eqv(Addh(s1, i), Addh(s2, i)) ≡ T 

For HAS 

From the semantics of INDEX, we have 
(i) INDEX(s, i) > 0 ≡ T, 
(ii) INDEX(s, i) ≤ SI$Size(s) ⇒ s[INDEX(s, i)] ≡ i, 
(iii) INDEX(s, i) > SI$Size(s) ⇒ (∀ j) [ 1 ≤ j ≤ SI$Size(s) ⇒ ~ s[j] ≡ i ] 

Assume Eqv(s1, s2); to show (∀ i) HAS(s1, i) ≡ HAS(s2, i). 
HAS(s1, i) ≡ INDEX(s1, i) ≤ SI$Size(s1) 

Case 1: INDEX(s1, i) ≤ SI$Size(s1) ≡ T 
s1[INDEX(s1, i)] ≡ i 
Using Eqv(s1, s2), we get (∃ j) [ (1 ≤ j ≤ SI$Size(s2)) ∧ s2[j] ≡ i ], so 
INDEX(s2, i) ≤ SI$Size(s2) ≡ T 
HAS(s1, i) ≡ HAS(s2, i) ≡ T 

Case 2: INDEX(s1, i) ≤ SI$Size(s1) ≡ F 
Using Eqv(s1, s2) and the above facts about INDEX, we get 
INDEX(s2, i) ≤ SI$Size(s2) ≡ F, so 
HAS(s1, i) ≡ HAS(s2, i) ≡ F 



5.3.4.2 Restrictions 

For a restriction specifying a required exception condition of σ, 
R_i(X) ⇒ σ(X) signals ext_i, 
show that whenever P_σ(X) and R_i(X) interpreted in I hold, the procedure OP 
implementing σ must signal ext_i. For example, the specification of Set-Int specifies the 
following required exception condition for Choose in its restrictions component: 

#(s) = 0 ⇒ Choose(s) signals no-element(). 
So the procedure CHOOSE' must signal no-element() when SIZE(s) = 0 
(≡ SI$Size(s) = 0) holds, which is indeed so (the precondition specified for Choose is T). 

For a restriction associating an optional exception condition with σ, 
σ(X) signals ext_i ⇒ O_i(X), 
show that whenever the procedure OP implementing σ signals ext_i, P_σ(X) and O_i(X) 
interpreted in I hold. For example, the specification of Stk-Int given in Figure 3.2 specifies 
the following optional exception condition for the operation Push: 

Push(s, i) signals overflow(s, i) ⇒ #(s) > 100. 
In an implementation of Stk-Int, if the procedure implementing Push signals overflow, then 
the size of the input stack must be > 100. 

We must also show that (i) if an input to a procedure OP implementing an 
operation σ satisfies its precondition and does not satisfy the condition for any of its required 
exceptions or optional exceptions, then the procedure terminates normally. Let 

C(X) = ( P_σ(X) ∧ ( ~ R_1(X) ∧ ... ∧ ~ R_l(X) ) ∧ ( ~ O_1(X) ∧ ... ∧ ~ O_m(X) ) ), 
where for 1 ≤ i ≤ l, R_i is the condition when σ is required to signal ext_i, and for 1 ≤ i ≤ m, O_i is 
the condition when σ has the option to signal an exception ext_i. We show that C(X) implies 
TC_normal(X), where TC_normal(X) is the weakest input condition for OP to terminate normally. 
For example, for every procedure in the implementations of Set-Int, the above condition is 
satisfied. 

If a nontrivial precondition P_σ is specified for a constructor σ, then the procedure 
OP implementing σ either signals on input X not satisfying P_σ, or returns a rep value 
which can be constructed by a constructor procedure using an input satisfying its 
precondition. For example, a correct implementation of Stk-Int can have the procedure 
implementing Pop return a stack when applied on an empty stack. If the procedure 
implementing Push signals overflow on a stack of size 128, say, then the procedure 
implementing Pop can only return a stack of size ≤ 128. It cannot return a stack of size 
1000, say; allowing it to do so would be meaningless. 






5.3.4.3 Axioms 

In the derivation of an axiom, we ensure that (i) for every occurrence of a
procedure name OP implementing the operation σ, the input to OP must satisfy the
precondition P_σ associated with σ, and (ii) no subexpression signals any exception.

If an axiom is an equation of the form 'e_1 ≡ e_2,' we prove that its interpretation in
I is derivable. If e_1 and e_2 are of type D, ≡ is interpreted as Eqv; otherwise, the
interpretation of e_1 ≡ e_2 in I can be derived using the theories constructed from the
specifications of the rep, the defining types, and internal types.

If an axiom is of the form 'e_1 ≡ if b then e_2,' we have to prove that 'b => e_1 ≡ e_2'
when interpreted in I is derivable. Similarly, for an axiom 'e_1 ≡ if b then e_2 else e_3,' we must
prove that 'b => e_1 ≡ e_2' and '~ b => e_1 ≡ e_3' are derivable in I. Recall that the condition b is
assumed to behave deterministically even when it involves nondeterministic operation
symbols. Figure 5.8 is a proof that the then part of the axiom,
Remove(Insert(s, i1), i2) ≡ if i1 = i2 then Remove(s, i2) else Insert(Remove(s, i2), i1),
is derivable. The derivation of the else clause,

(~ i1 = i2) => Remove(Insert(s, i1), i2) ≡ Insert(Remove(s, i2), i1),
uses a property of the representing values that
(Vi)[(rep$SM^>OAliKs,i»^ai^rr^i^i«|»$SMs>Asy|milJ, 



Figure 5.8. Proof that an Axiom of Set-Int is Derivable

i1 = i2 => Remove(Insert(s, i1), i2) ≡ Remove(s, i2)

Assume i1 = i2, to show Eqv(REMOVE(INSERT(s, i1), i2), REMOVE(s, i2))

Case 1: INDEX(s, i1) ≤ rep$Size(s) ≡ T
INSERT(s, i1) = s, so the above holds.

Case 2: INDEX(s, i1) ≤ rep$Size(s) ≡ F
Let r = INSERT(s, i1) = Addh(s, i1)
Using i1 = i2, INDEX(r, i2) = rep$Size(Addh(s, i1)), so
REMOVE(r, i2) = s, and
REMOVE(s, i2) = s, so the above holds.






which is preserved by the constructor procedures. 

The axiom Choose(s) ∈ s = T under the condition '~ Size(s) = 0,' when
interpreted in I is 'HAS(CHOOSE'(s), s) = T.' This is derivable, because
INDEX(MAX(s), s) ≤ rep$Size(s) = T is derivable. The remaining axioms in the
specification of Set-Int can also be shown to be derivable.

The above five steps constitute the correctness method. If an implementation I
can go through the above steps, it is correct with respect to S. For example, the
implementation of Set-Int given in Figure 5.1 with CHOOSE replaced by CHOOSE' goes
through the above steps, and is thus correct.
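
The derivations of the Remove/Insert axiom and of the Choose axiom can be mirrored by an exhaustive check over a small universe. The Python sketch below is an illustration under stated assumptions rather than a transcription of Figure 5.1: it assumes the rep is a duplicate-free sequence (modelled as a tuple), that INDEX is a 1-based position returning Size + 1 when the element is absent, that INSERT adds at the high end, and that Eqv relates sequences with the same elements. `CHOOSE1` stands in for CHOOSE', and `REPS` and the two checking functions are hypothetical helpers.

```python
from itertools import permutations, product

def INDEX(s, i):
    """1-based position of i in s; len(s) + 1 when absent (assumed convention)."""
    return s.index(i) + 1 if i in s else len(s) + 1

def HAS(s, i):
    return INDEX(s, i) <= len(s)

def INSERT(s, i):
    return s if HAS(s, i) else s + (i,)   # Addh: add at the high end

def REMOVE(s, i):
    return tuple(x for x in s if x != i)

def CHOOSE1(s):
    """Stands in for CHOOSE': returns the maximum element, signals on empty."""
    if len(s) == 0:
        raise LookupError("no-element")
    return max(s)

def Inv(s):
    return len(set(s)) == len(s)          # invariant: no duplicates

def Eqv(s1, s2):
    return set(s1) == set(s2)             # same elements, order ignored

# All duplicate-free sequences over the universe {0, 1, 2}.
REPS = [p for n in range(4) for p in permutations(range(3), n)]

def axiom_remove_insert_holds():
    """Remove(Insert(s, i1), i2) = if i1 = i2 then Remove(s, i2)
                                   else Insert(Remove(s, i2), i1), up to Eqv."""
    for s, i1, i2 in product(REPS, range(3), range(3)):
        lhs = REMOVE(INSERT(s, i1), i2)
        rhs = REMOVE(s, i2) if i1 == i2 else INSERT(REMOVE(s, i2), i1)
        if not (Inv(lhs) and Inv(rhs) and Eqv(lhs, rhs)):
            return False
    return True

def axiom_choose_holds():
    """Choose(s) is a member of s whenever s is nonempty."""
    return all(HAS(s, CHOOSE1(s)) for s in REPS if len(s) > 0)
```

Such a check covers only the sampled rep values; the derivation in the text is what establishes the axiom for all of them.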

5.3.5 Nondeterministic Procedures 

We now consider the case when an implementation has a nondeterministic
procedure implementing an operation specified to be nondeterministic by S. We have
already discussed the conditions for a nondeterministic procedure to preserve Inv and the
equivalence relation Eqv. Various steps in the correctness proof discussed above remain
the same except that if an axiom involves the nondeterministic procedure, we must use the
interpretation of formulas involving nondeterministic function symbols discussed in
Chapter 4. In addition, it must be ensured that for any input the nondeterministic
procedure does not have a choice of signalling as well as terminating normally.

For example, if we consider the implementation of Set-Int in Figure 5.1 with
CHOOSE replaced by CHOOSE'', most of the above proof remains valid. We have to
show that the axiom Choose(s) ∈ s = T is derivable under the condition '~ Size(s) = 0'.
That is, if 'rep$Size(s) > 0' holds, then

HAS(s, CHOOSE''(s)) = T          (*)

is derivable. CHOOSE''(s) can either return MAX(s) or MIN(s). For both possibilities, (*)
is derivable, as

INDEX(MAX(s), s) ≤ rep$Size(s) = T

is derivable from the specifications of MAX and INDEX, and

INDEX(MIN(s), s) ≤ rep$Size(s) = T
is derivable from the specifications of MIN and INDEX. Note that CHOOSE'' preserves
the equivalence relation Eqv.



5. (∃! j) stands for 'there exists a unique j such that'

The implementation of Set-Int in Figure 5.1 with CHOOSE replaced by
CHOOSE'' is also correct.

5.3.6 Pseudo-Nondeterministic Procedures 

A pseudo-nondeterministic procedure (which could be either deterministic or
nondeterministic) is not required to preserve the equivalence relation Eqv. The
correctness proof in this case is also carried out as above, depending on whether the procedure
is deterministic or nondeterministic. However, we must ensure that if the procedure
terminates normally for any input X, then it must do so for all inputs equivalent to X, and if
it signals on an input X, then it must signal equivalent exceptions for all inputs equivalent to
X. This ensures that a pseudo-nondeterministic procedure does not have a choice of
signalling as well as terminating normally on equivalent rep values.

We now take the implementation of Set-Int in Figure 5.1. CHOOSE is
deterministic; it returns the bottom element of the nonempty sequence. Eqv is not
preserved by CHOOSE. If the axiom Choose(s) ∈ s = T is derivable under the condition
that 'Size(s) ≠ 0,' then this implementation is also correct. The proof of the axiom is
straightforward: If 'rep$Size(s) > 0' holds, then
HAS(s, CHOOSE(s)) ≡ T ⇔ HAS(s, Bottom(s)) ≡ T
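
The behavior of a pseudo-nondeterministic procedure can be made concrete with two equivalent rep values. In the Python sketch below, which is only an illustration, the tuple rep, the definitions of `HAS` and `Eqv`, and the helper `outcome` are assumptions; `CHOOSE` returns the bottom (first) element as described above.

```python
def HAS(s, i):
    return i in s

def Eqv(s1, s2):
    return set(s1) == set(s2)   # assumed: same elements, order ignored

def CHOOSE(s):
    """Returns the bottom (first) element; signals on the empty sequence."""
    if len(s) == 0:
        raise LookupError("no-element")
    return s[0]

def outcome(proc, s):
    """Classify a call as normal termination or a signalled exception."""
    try:
        return ("normal", proc(s))
    except LookupError:
        return ("signal", "no-element")

s1, s2 = (1, 2), (2, 1)   # two equivalent rep values of the same set
```

Here `Eqv(s1, s2)` holds while `CHOOSE(s1)` and `CHOOSE(s2)` differ, so Eqv is not preserved; yet the axiom holds for both values, and the kind of outcome (normal termination or signal) is the same on equivalent inputs, as required.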

When an implementation does not have any pseudo-nondeterministic procedures,
then the interpretation of ≡ in I is the largest equivalence relation preserved by the
procedures. However, a weaker equivalence relation preserved by the procedures may
suffice to show that the restrictions and axioms of S hold in I.



6. For example, a procedure CHOOSE''' which nondeterministically picks between the top (last) and the
bottom (first) element of the sequence is nondeterministic and does not preserve the equivalence relation Eqv.
So, CHOOSE''' is also pseudo-nondeterministic.






Though the designer of an implementation usually has an idea of what the
observable equivalence relation is, sometimes it may not be known. In that case, we will
not know what procedures are pseudo-nondeterministic. Then, we choose an equivalence
relation preserved by the procedures implementing the deterministic operations, and using
it as the interpretation of ≡, we attempt to show that every axiom as interpreted in I is
derivable. If successful, the implementation I is correct; otherwise, a stronger equivalence
relation is chosen and the above process is repeated. If the correctness of I cannot be
established even when the strongest equivalence relation preserved by the procedures
implementing the deterministic operations is chosen, then I is incorrect.

Another way to view the above correctness method is to consider the specification 
of the procedures in an implementation I as axioms of the theory of I, defining the 
functions computed by the procedures, and show that every nonlogical axiom of Th(S) is in 
the theory of I. The theory of I also includes the theories of the types used in I. Nakajima 
et al [62] take a similar view. 






5.4 Recursive and Mutually Recursive Implementations 

Def. 5.3 An implementation I of D depends on a data type D' if and only if

(i) D' is used in I, or

(ii) a data type D'' used in I depends on D'. ■

In Def. 5.3 above, it is assumed that data types other than D are abstractly used in
an implementation I of D. In the correctness method discussed in the previous two
sections, we have assumed that
(i) an implementation I of D does not depend on D, and
(ii) an implementation of a data type D' used in I does not depend on D.
We relax these constraints. We call an implementation I of D recursive if and only if the
rep used in I depends on D. We call an implementation I of D and another
implementation I' of D' mutually recursive if and only if I depends on D' and I' depends on
D. We assume that recursion is not due to internal types used in I. It should be noted that
if the implementations of a set of data types are mutually recursive, that does not mean that
the data types are also mutually recursive (mutually recursive data types are discussed in
Section 2.4). We first discuss how the method proposed in Section 5.3 can be modified to deal
with recursive implementations; later we consider mutually recursive implementations.
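
The "depends on" relation of Def. 5.3 is the transitive closure of the "is used in" relation, so it can be computed mechanically. The Python sketch below is a simplification offered for illustration: it assumes one implementation per data type, records in a hypothetical table `uses` the types used in each implementation (including those used in its rep), and the type names Seq-Int, Tree, and Forest are invented for the example.

```python
def depends_on(uses, d):
    """All types the implementation of d depends on (Def. 5.3).

    uses maps a type name to the set of types used in its implementation.
    """
    seen, stack = set(), list(uses.get(d, ()))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)                      # clause (i), or reached via clause (ii)
            stack.extend(uses.get(t, ()))    # clause (ii): follow used types
    return seen

def is_recursive(uses, d):
    """The implementation of d depends on d itself."""
    return d in depends_on(uses, d)

def mutually_recursive(uses, d1, d2):
    """Each of the two implementations depends on the other's type."""
    return (d1 != d2
            and d2 in depends_on(uses, d1)
            and d1 in depends_on(uses, d2))

uses = {
    "List-Int": {"Int", "Null", "Pair"},
    "Pair": {"Int", "List-Int"},
    "Set-Int": {"Seq-Int", "Int"},   # Seq-Int is a hypothetical rep type
    "Seq-Int": {"Int"},
    "Tree": {"Int", "Forest"},       # hypothetical mutually recursive pair
    "Forest": {"Tree"},
}
```

With this table, `is_recursive(uses, "List-Int")` is true, `is_recursive(uses, "Set-Int")` is false, and `mutually_recursive(uses, "Tree", "Forest")` is true.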

5.4.1 Recursive Implementations 

In proving correctness of a recursive implementation, we consider a reference to
D in I as a reference to its rep and an invocation of an operation σ of D as a call to the
procedure OP implementing σ. The equate defining the rep inside I is considered as a
recursive domain equation, as the construction of the rep depends on D itself. For
example, consider the implementation of a data type list of integers, denoted by List-Int,
given in Figure 5.10; its rep is a recursive domain equation. A recursive domain equation
can be solved by defining an ordering on type algebras and using Kleene's Recursion
Theorem. The rep is the least fixed point solution of the equation (see [3] for details about
such an ordering).

Figure 5.9. An Uninteresting Recursive Implementation of D

D = cluster is OP_1, OP_2, ...

    rep = D

    OP_1 = proc(...) returns ...
        return (D$OP_1(...))
        end OP_1



Figure 5.10. Implementation of List-Int

LIST-INT = cluster is NIL, CONS, CAR, CDR, IS_IN, IS_EMPTY

    rep = oneof [empty: Null, pair: Pair]
    Pair = struct [car: Int, cdr: List-Int]

    NIL = proc() returns (cvt)
        return (rep$make_empty(nil))
        end NIL

    CONS = proc(i: Int, l: List-Int) returns (cvt)
        return (rep$make_pair(Pair${car: i, cdr: l}))
        end CONS

    CAR = proc(l: cvt) returns (Int) signals (empty)
        tagcase l
            tag pair (p: Pair): return (p.car)
            tag empty: signal empty()
            end
        end CAR

    CDR = proc(l: cvt) returns (List-Int) signals (empty)
        tagcase l
            tag pair (p: Pair): return (p.cdr)
            tag empty: signal empty()
            end
        end CDR

    IS_IN = proc(i: Int, l: cvt) returns (Bool)
        tagcase l
            tag pair (p: Pair): if p.car = i then return (true)
                                else return (List-Int$is_in(i, p.cdr)) end
            tag empty: return (false)
            end
        end IS_IN

    IS_EMPTY = proc(l: cvt) returns (Bool)
        return (rep$is_empty(l))
        end IS_EMPTY

For a correct implementation I, the type algebras of the rep should have a
nonempty principal domain. This property is trivially ensured if the rep is nonrecursive. For
some recursive implementations, such as the one given in Figure 5.9, the least fixed point is
the empty algebra, an algebra having no domain and no functions. For well founded rep
equates, such as in the case of List-Int, the algebras are nonempty. If the rep can be proved to
be nonempty, the method proposed in the previous section can be used. The proof that the
least fixed point of a domain equation defining the rep is nonempty is the only additional
step in proving the correctness of a recursive implementation.
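
The difference between the two kinds of rep equates can be seen by computing successive approximations to the least fixed point, starting from the empty domain. The Python sketch below is an illustration only: it models a domain as a finite set of tagged tuples, uses a two-element stand-in `INTS` for the domain of Int, and the names `approximations`, `list_int_step`, and `uninteresting_step` are hypothetical. A finite number of rounds yields only an approximation; the least fixed point is the limit of the chain.

```python
def approximations(step, rounds):
    """Iterate a domain equation 'rounds' times, starting from the empty domain."""
    domain = frozenset()
    for _ in range(rounds):
        domain = step(domain)
    return domain

INTS = (0, 1)   # a tiny stand-in for the domain of Int

def list_int_step(d):
    """rep = oneof [empty: Null, pair: struct [car: Int, cdr: List-Int]]"""
    empty = frozenset({("empty",)})
    pairs = frozenset(("pair", i, l) for i in INTS for l in d)
    return empty | pairs

def uninteresting_step(d):
    """rep = D, as in Figure 5.9: the equation contributes no new values."""
    return d
```

For the List-Int equation the approximations grow (1, 3, 7, ... values with this stand-in for Int), so the rep is nonempty; for the equation rep = D every approximation stays empty.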

Figure 5.11 has specifications of the procedures in the implementation of List-Int.
(The specifications of Null, Struct [n_1: D_1, ..., n_k: D_k], and One-of [n_1: D_1, ..., n_k: D_k] are
given in Appendix IV.) Figure 5.12 is a specification of List-Int. We give an illustration
of various steps in the correctness proof of the implementation of List-Int given in
Figure 5.10.

Figure 5.11. Translation of the Procedures of List-Int

rep = oneof [empty: Null, pair: Pair]
Pair = struct [car: Int, cdr: List-Int]

NIL() = rep$make_empty(nil)

CONS(i, l) = rep$make_pair(Pair${car: i, cdr: l})

CAR(l) = rep$is_pair(l) => Pair$get_car(rep$value_pair(l)) |
         ~ rep$is_pair(l) => signal(empty)

CDR(l) = rep$is_pair(l) => Pair$get_cdr(rep$value_pair(l)) |
         ~ rep$is_pair(l) => signal(empty)

IS_IN(i, l) = rep$is_pair(l) => (i = Pair$get_car(rep$value_pair(l)) ∨
                                 IS_IN(i, Pair$get_cdr(rep$value_pair(l)))) |
              ~ rep$is_pair(l) => false

IS_EMPTY(l) = rep$is_empty(l)






Figure 5.12. Specification of List-Int

Operations

    Nil : -> List-Int
    Cons : Int × List-Int -> List-Int
    Car : List-Int -> Int
                   -> empty()
    Cdr : List-Int -> List-Int
                   -> empty()
    Is-In : Int × List-Int -> Bool
    Is-Empty : List-Int -> Bool

Restrictions

    Is-Empty(l) => Car(l) signals empty()
    Is-Empty(l) => Cdr(l) signals empty()

Axioms

    Car(Cons(i, l)) ≡ i
    Cdr(Cons(i, l)) ≡ l
    Is-In(i, Nil) ≡ F
    Is-In(i1, Cons(i2, l)) ≡ if i1 = i2 then T else Is-In(i1, l)
    Is-Empty(Nil) ≡ T
    Is-Empty(Cons(i, l)) ≡ F



(i) The least fixed point of the recursive domain equation is nonempty. For any model of
Int, the approximations to the rep can be constructed.

(ii) Inv(s) is T.

(iii) The termination of procedures other than IS_IN is obvious, assuming that the
tagcase and the operations of one-of terminate. For IS_IN, we can prove termination
using McCarthy and Cartwright's approach [8], as the structure of the rep is well
founded with respect to the ordering, l < oneof [pair: {car: i, cdr: l}] for any i and l.

(iv) The equivalence relation on the rep is the identity relation.

(v) The procedures return normally on an input on which the restriction component does
not specify the corresponding operation to signal.

(vi) Every restriction is derivable. 

(vii) Every axiom is derivable. 
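
Steps (v) through (vii) can be exercised on a finite sample of lists. The following Python sketch is an illustration, not a proof: it transliterates the procedures of Figure 5.10 onto tagged tuples (an assumed encoding of the oneof rep), uses Python equality as the identity relation of step (iv), and the helpers `sample_lists`, `signals_empty`, `restrictions_hold`, and `axioms_hold` are hypothetical names.

```python
class Empty(Exception):
    """Stands in for the exception 'empty'."""

# Assumed encoding of the rep: ("empty",) or ("pair", car, cdr).
def NIL():
    return ("empty",)

def CONS(i, l):
    return ("pair", i, l)

def CAR(l):
    if l[0] == "pair":
        return l[1]
    raise Empty()

def CDR(l):
    if l[0] == "pair":
        return l[2]
    raise Empty()

def IS_IN(i, l):
    if l[0] == "pair":
        return l[1] == i or IS_IN(i, l[2])
    return False

def IS_EMPTY(l):
    return l[0] == "empty"

def signals_empty(proc, l):
    try:
        proc(l)
    except Empty:
        return True
    return False

def sample_lists(ints, depth):
    """All lists over ints of length at most depth."""
    lists = [NIL()]
    for _ in range(depth):
        lists = [NIL()] + [CONS(i, l) for i in ints for l in lists]
    return lists

def restrictions_hold(lists):
    """Is-Empty(l) => Car(l) and Cdr(l) signal empty; otherwise they return normally."""
    return all(signals_empty(CAR, l) == IS_EMPTY(l)
               and signals_empty(CDR, l) == IS_EMPTY(l) for l in lists)

def axioms_hold(ints, lists):
    """Check the six axioms of Figure 5.12 on the sample."""
    for l in lists:
        for i2 in ints:
            c = CONS(i2, l)
            if CAR(c) != i2 or CDR(c) != l or IS_EMPTY(c):
                return False
            for i1 in ints:
                expected = True if i1 == i2 else IS_IN(i1, l)
                if IS_IN(i1, c) != expected:
                    return False
    return IS_EMPTY(NIL()) and not any(IS_IN(i, NIL()) for i in ints)
```

For instance, `axioms_hold([0, 1], sample_lists([0, 1], 3))` checks every axiom on all fifteen lists of length at most three over {0, 1}.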






5.4.2 Mutually Recursive Implementations 

We prove the correctness of mutually recursive implementations in a way similar
to that for a recursive implementation. The correctness of mutually recursive
implementations must be proved together. The reps of the two implementations are
specified as mutually recursive domain equations; the solutions of these equations are the
least fixed points, which serve as the rep of D and the rep of D'. For the implementations I
and I' to be correct, both reps must be nonempty. The rest of the proof is the same as in the case of
nonrecursive implementations, with the exception that the correctness proof for all mutually
recursive implementations is done together. The implementations I and I' have to be
shown to satisfy the restrictions and axioms in S and S'. The invocation of an operation of
D' in I is considered as a call to the procedure in I' implementing the operation, and the
invocation of an operation of D in I' is considered as a call to the procedure in I
implementing the operation.

The correctness proof cannot be hierarchically structured in the case of mutually
recursive implementations, because their correctness has to be proved together. For this
reason, we do not recommend that hierarchically structured (nonrecursive) data types be
implemented mutually recursively. However, for a set of mutually recursive data types,
their implementations have to be proved correct together; so these data types can be
implemented mutually recursively without adding to the complexity of the correctness
proof.






6. Conclusions 

We have presented a rigorous framework for abstract data types, and studied four
important aspects of abstract data types, namely definition, specification, theory, and
implementation correctness, within this framework. An overview of the approach taken in
studying these issues is given in Chapter 1. The framework has provided a base from
which to ask many interesting and important questions about data types. Some of these
questions have been answered in the thesis, while others suggest directions for further
research. Below, we first summarize the contributions of our work and then indicate areas
where further work is required.

6.1 Summary of Contributions 

We have made a clear distinction between a data type and its specification(s) in
our research. The behavioral approach for defining a data type developed in the thesis
embodies the view of a data type taken in programming languages. It considers only the
input-output behavior of the operations. It abstracts from the representational structure of
the values and the operations of a data type as well as from multiple representations of
values for a particular representational structure. Our definitional method can handle data
types with nondeterministic operations and with operations exhibiting exceptional
behavior. It is independent of specification methods for data types. Specification
languages other than the one proposed in the thesis can also be developed based on it. It
can be used to give the semantics of existing specification languages. In [43], we have
studied and compared the expressive power of various specification languages for data
types. Using the definitional method, we have been able to characterize computability over
the values of a data type, and study the expressive power of the operation set of different
designs of a data type [42].

The specification language proposed in the thesis is structured and flexible. The
normal behavior and the exceptional behavior of the operations are specified separately.
The language provides mechanisms to specify (i) nondeterministic operations, (ii)
preconditions for operations stating what portion of the input domain of an operation is
interesting, (iii) exceptions which must be signalled by the operations, and (iv) exceptions
which the operations can optionally signal. In designing the specification language, one of
the goals has been to facilitate writing specifications as well as proving properties of data
types from their specifications without having to express the properties that can be
deduced. The semantics of a specification is given as a set of data types. Equivalence
among specifications is defined.

We have proposed a deductive system for abstract data types and studied its 
different components. A first order theory of a data type is defined, which is constructed 
from its specification using the deductive system. The well definedness, sufficient 
completeness and completeness properties of a specification are defined based on what can 
be deduced from it. These properties are related to the model theoretic properties of a 
specification. A clear distinction is made between the model theoretic and proof theoretic 
properties of a specification. 

We propose a correctness criterion for an implementation of a data type with 
respect to its specification, independent of implementation correctness methods and 
specification methods. Many implementation correctness methods can be developed 
embodying this criterion. We develop a correctness method which is simple and natural 
for a wide class of specifications. 

Throughout this research, we have emphasized modularity and hierarchical 
structure, be it the definition, specification, deductive system, or implementation of a data 
type. 

The development of the framework has also provided useful insights into data
type behavior and programming language features, such as the advantage of having a
protected encapsulation mechanism for implementing a data type, separation of the
exception handlers from the type behavior, significance of hierarchical structure and
modularity, etc.






6.2 Directions for Further Research 

We first discuss topics of further research emerging from the discussion in various 
chapters. We later discuss other aspects of data type behavior not studied in the thesis, and 
finally, the topics in which the assumptions made about data type behavior in the thesis are 
relaxed. 

We have not investigated how easily the deductive system proposed in Chapter 4 
can be automated or incorporated into an existing automatic data type deduction system 
such as AFFIRM. We do not anticipate any major problems in incorporating the 
subsystem for reasoning about the exceptional behavior of a data type, because the axioms 
describing the exceptional behavior are similar to equations and can be transformed to 
rewrite rules. However, the subsystem for reasoning about nondeterministic operations 
involves axioms using existential quantifiers. A verification system based on first order 
predicate calculus can in principle incorporate this subsystem. We feel that the full power 
of first order predicate calculus with its complexity is not required. An approach for
untransformed axioms (in which properties are expressed using nondeterministic symbols) 
similar to rewrite rules for equational axioms needs to be investigated. 

The implementation correctness method discussed in Chapter 5 uses an 
equivalence relation on the values of the rep (representing type) and requires that the 
implementation be extended to include the definitions of auxiliary functions used in a 
specification, if any. It would be useful to develop a method that can derive this 
information from the specification and the implementation. We do not anticipate any 
problems in automating the remaining steps of the method; however, the interface between 
a verification system embodying proof rules for control structures and a data type 
deduction system may need to be analyzed. We are investigating another method that does 
not require the equivalence relation and the definitions of auxiliary functions for an 
implementation. It is based on the behavioral equivalence relation on models: For every 
computation having an observer as its outermost operation, if the specification prescribes a 
result, a value returned by the computation when interpreted in the implementation must 
be one of the possible results prescribed by the specification. 

The proposed implementation correctness method tells whether an
implementation is correct with respect to a specification. It would be interesting to extend
it so that the bug(s) in an incorrect implementation can be located; this would help in
debugging the implementation.

Another complementary area for further study is that of systematic testing for
enhancing confidence in a piece of software. In addition to using it for testing programs
using the data type, a specification of a data type can be used to design a set of test cases for
checking the implementations of the data type. Gannon et al. [19] discuss a system in
which a specification of a data type as a set of conditional equations is presented along with
a set of test cases which can be executed using the implementation to test for the
consistency of the implementation with the specification. A methodology for designing an
'adequate' set of test cases from a specification would be very useful for such a system.

Specifications are often hard to write, and especially the writing of an 'algebraic'
specification has been found to be hard [41, 3]. We are investigating a method for writing a
specification in a systematic manner; using this method, we have been able to write
specifications of data types such as traversable stack [41], file [42], etc. A system that
embodies such a method and helps a designer in writing a specification would be very
useful. It should assist the designer in analyzing a specification so as to enhance his
confidence in the specification. It should check for general structural properties of a
specification such as well definedness and completeness, which ensure proper relations
among different components of the specification. The undecidability of completeness and
well definedness can be shown by reducing them to the correspondence problem [58]
in Post systems. However, sufficient conditions on axioms and restrictions which guarantee
well definedness and completeness of a specification need to be investigated. The results of
Guttag and Horning [28] and Polajnar [67] will probably be helpful in arriving at these
conditions.

It is equally important to ensure that a specification indeed captures the intent of
the designer. This can be checked in several ways, some of which are complementary. The
designer can express additional properties that a data type should satisfy. He then attempts
to prove these properties from its specification using the deductive system. Another
approach is for the designer to come up with a model of the data type and then check that
the axioms and restrictions hold in that model. A third approach can be similar to program
testing; the specification can be validated on a set of test cases.

Guttag and Horning [32] have suggested how formal specifications can be used as 
a tool for designing software. Our specification language can be used to aid the design of 
the data component of software. For it to be used for writing specifications of general 
software, it must be extended to include mechanisms for specifying mutable behavior, 
procedural abstractions, other control abstractions, etc. 

An important aspect of data types not studied in our framework is the
relationships among different data types. One important relationship is among the set of
data types defined by a type scheme (also called a parameterized type). Data types in the
set defined by a type scheme have similar behavior except that the values of these data
types may have their constituents belonging to different types, and the values may have
different structural constraints, for example, different upper bounds on the size of the
values, etc. This variation in the behavior of different types is expressed using two kinds of
parameters: constant parameters ranging over the values of a data type, often used to
express the structural constraints on the values, such as bounds on the size of the values,
and type parameters stating the type of the constituents of the values. For example, a type
scheme Stk[n : Int, t : Types] defines a set of data types that have the behavior of stacks,
and that differ in the type of the elements of stacks and the upper bound on the size of
stacks. Types stands for the set of all data types, and is itself not a data type. The data type
Stk-Int-100, for example, is an instance of the above type scheme with n = 100 and t = Int.

A type scheme is in general a partial function from the cartesian product of the
domains of its parameters to the set of all types, Types. For a particular set of parameters,
this function either returns a data type or is undefined. For example, the type scheme Stk
is a function from Int × Types to Types, and of parameters.

However, if parameters of a type scheme are required to satisfy certain properties, then the
function returns a data type only if the parameters satisfy the desired properties. For
example, in case of the type scheme Set[t : Types], its type parameter must have an equal
operation with standard properties.
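
The view of a type scheme as a (possibly partial) function from parameters to data types can be sketched as a function that returns a class. The Python code below is an illustration under stated assumptions, not a construct of the specification language: `Stk` and `Set` return Python classes with immutable values, `None` models "undefined", the requirement on the type parameter of `Set` is modelled by a hypothetical `equal` attribute, and `IntT` and `Opaque` are invented element types.

```python
def Stk(n, t):
    """Type scheme Stk[n : Int, t : Types]: a stack type with size bound n."""
    if not isinstance(n, int) or not isinstance(t, type):
        return None   # arguments outside the parameter domains

    class Stack:
        bound, elem = n, t

        def __init__(self, items=()):
            self.items = tuple(items)

        def push(self, x):
            if not isinstance(x, t):
                raise TypeError("wrong element type")
            if len(self.items) >= n:
                raise OverflowError("overflow")
            return Stack(self.items + (x,))   # values are immutable

    Stack.__name__ = f"Stk-{t.__name__}-{n}"
    return Stack

def Set(t):
    """Type scheme Set[t : Types]: defined only if t has an equal operation."""
    if not (isinstance(t, type) and callable(getattr(t, "equal", None))):
        return None   # undefined: the requirement on t is not met

    class SetOf:
        def __init__(self, items=()):
            self.items = tuple(items)

        def has(self, x):
            return any(t.equal(x, y) for y in self.items)

        def insert(self, x):
            return self if self.has(x) else SetOf(self.items + (x,))

    return SetOf

class IntT:
    """Hypothetical element type that provides an equal operation."""
    def __init__(self, v):
        self.v = v

    @staticmethod
    def equal(a, b):
        return a.v == b.v

class Opaque:
    """Hypothetical element type without an equal operation."""
```

Here `Stk(100, int)` plays the role of an instance such as Stk-Int-100, `Set(IntT)` returns a set type, and `Set(Opaque)` is undefined because the type parameter lacks the required operation.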






The specification language proposed in Chapter 3 can be easily extended to
specify type schemes. A specification should have an additional component, called
Requires, stating conditions on the parameters ranging over types. The Requires
component can specify both the operations that the type parameter must have and their
properties. The semantics of such a specification can be easily given. How the deductive
system proposed in Chapter 4 can be extended to type schema would need further
investigation.

Apart from type schemes, there are other interesting relations among different
data types. There are standard mathematical relations, such as the relation between a
cartesian product of data types and its components; the relation between a discriminated
union and its components; etc. Some of these relations can be expressed as type schema.
The notion of a subtype of a type needs investigation. For example, what relations exist
between integers, rationals, and algebraic reals? How do sets, multisets, ordered sets, and
sequences relate, and how do stacks and traversable stacks relate?

Our framework is limited in three respects. Firstly, the definition of a data type 
only incorporates the input-output behavior of its operations. It does not consider another 
aspect of the operations, namely how efficiently these operations can be performed. It is 
not even clear whether the computational complexity of the operations should be included 
in a definition of a data type, or whether it is an orthogonal constraint on the 
implementations that should be included in a specification. We think that the input-output 
behavior of the operations of a data type should be kept separate from their computational 
complexity, but a specification should have another component stating the performance 
requirements on the implementations of the operations. 

Secondly, we have assumed a simple model of nondeterminism in analyzing the
input-output behavior of the operations. For an input on which a nondeterministic
operation can return many possible results, we have not considered how these results are
scheduled. It would be interesting to incorporate the scheduling information and extend
the definitions of observable behavior and distinguishability of values. It would also be
interesting to investigate how our formalism is affected if we relax the assumption that a
nondeterministic operation cannot have the choice of signalling as well as terminating
normally on a particular input.

Thirdly, the definitional method handles only immutable data types. As is
discussed in Appendix I, for a wide class of mutable data types, the states of their objects
can be modeled as the values of an immutable data type. However, the framework needs to
be extended to handle arbitrary mutable data types including data types having objects
whose state is also mutable, for example, the data type list in MACLISP. The specification
language and a deductive system based on the extended framework need to be developed.
Berzins's work [3] can be useful in studying this extension.






References 

1. Preliminary ADA Reference Manual and Rationale. SIGPLAN Notices
Vol. 14 No. 6, June, 1979.

2. Berzins, V. Personal Communication. Lab. for Computer Science, MIT,
Dec., 1976.

3. Berzins, V. Abstract Model Specification for Data Abstractions.
LCS-TR-221, Lab. for Computer Science, MIT, July, 1979.

4. Birkhoff, G., Lipson, J.D. Heterogeneous Algebras. Journal of
Combinatorial Theory Vol. 8, 1970, pp. 115-133.

5. Brand, D., Darringer, J.A., Joyner, W.H. Completeness of Conditional
Reductions. IBM Research Report RC 7404, Yorktown Heights, New York,
Dec., 1978.

6. Burstall, R.M. Proving Properties of Programs by Structural Induction.
Computer Journal Vol. 12, Feb., 1969, pp. 41-48.

7. Burstall, R.M., Goguen, J.A. Putting Theories Together To Make
Specifications. Invited Paper at the Fifth International Joint Conf. on Artificial
Intelligence, Cambridge, MA, Aug., 1977.

8. Cartwright, R., and McCarthy, J. Recursive Programs as Functions in a First
Order Theory. Report No. STAN-CS-79-717, Stanford University,
March, 1979.

9. Cohen, J. Nondeterministic Algorithms. Computing Surveys Vol. 11 No. 2,
June, 1979, pp. 79-94.

10. Cohn, P.M. Universal Algebra. Harper and Row, New York, 1965.

11. Dahl, O.J., Nygaard, K., Myhrhaug, B. The Simula 67 Common Base
Language. Norwegian Computing Center, Forskningsveien 1B, Oslo, 1968.

12. Dijkstra, E.W. Notes on Structured Programming. In Structured
Programming (Dahl, O.-J., Dijkstra, E.W., Hoare, C.A.R.), Academic Press,
London and New York, 1972, pp. 1-81.






J 3. Dijkstra, E.W. A Discipline of Programming. Prentice Hall, Englewood 
Cliffs, NJ, 1976. 

14. Dershowitz, N., and Manna, Z. Proving Termination with Multiset 
Ordering. Comm. ACM Vol. 22 No. 8, Aug., 1979, pp. 465-476. 

15. Ehrig, H., Kreowski, H., Padawitz, P. Stepwise Specification and 
Implementation of Abstract Data Types. Proceedings of the Fifth International 
Colloq. on Automata, Languages, and Programming, Udine, Lecture Notes in 
Computer Science Vol. 62, Springer-Verlag, 1978, pp. 205-226. 

16. Enderton, H.B. A Mathematical Introduction to Logic. Academic Press, 
New York and London, 1972. 

17. Floyd, R.W. Assigning Meanings to Programs. Proceedings of a 
Symposium in Applied Mathematics, Vol. 19, Mathematical Aspects of Computer 
Science (ed. Schwartz, J.T.), American Mathematical Society, Providence, RI, 
1967, pp. 19-32. 

18. Friedman, D.P., Wise, D.S. CONS Should Not Evaluate Its Arguments. 
Technical Report No. 44, Computer Science Dept., Indiana University, 
Nov., 1975. 

19. Gannon, J., McMullin, P., Hamlet, R., Ardis, M. Testing Traversable 
Stacks. SIGPLAN Notices Vol. 15 No. 1, Jan., 1980, pp. 58-65. 

20. Goguen, J.A. Abstract Errors for Abstract Data Types. Proceedings of the 
IFIP Working Conference on Formal Basis of Programming Concepts, Vol. 2, 
Aug., 1977, pp. 21.1-21.32. 

21. Goguen, J.A., and Tardo, J.J. An Introduction to OBJ: A Language for 
Writing and Testing Formal Algebraic Program Specifications. Proceedings 
IEEE Conf. on Specifications of Reliable Software, Cambridge, MA, 
April, 1979, pp. 170-189. 

22. Goguen, J.A., Thatcher, J.W., Wagner, E.G., Wright, J.B. Abstract Data 
Types as Initial Algebras and Correctness of Data Representations. 
Proceedings, Conference on Computer Graphics, Pattern Recognition and Data 
Structure, May, 1975, pp. 89-93. 



23. Goguen, J.A., Thatcher, J.W., Wagner, E.G. Initial Algebra Approach to 
the Specification, Correctness, and Implementation of Abstract Data Types. In 
Current Trends in Programming Methodology, Vol. IV, Data Structuring, (ed. 
Yeh, R.T.), Prentice Hall, Englewood Cliffs, NJ, 1978. 

24. Goodenough, J.B. Exception Handling: Issues and a Proposed Notation. 
Comm. ACM Vol. 18 No. 12, Dec., 1975, pp. 683-696. 

25. Guttag, J.V. The Specification and Application to Programming of 
Abstract Data Types. Ph.D. Thesis, University of Toronto, CSRG-59, 1975. 

26. Guttag, J.V. Abstract Data Types and the Development of Data Structures. 
Comm. ACM Vol. 20 No. 6, June, 1977, pp. 396-404. 

27. Guttag, J.V., Horowitz, E., Musser, D.R. The Design of Data Type 
Specification. In Current Trends in Programming Methodology, Vol. IV, Data 
Structuring, (ed. Yeh, R.T.), Prentice Hall, Englewood Cliffs, NJ, 1978. 

28. Guttag, J.V., Horning, J.J. The Algebraic Specification of Abstract Data 
Types. Acta Informatica Vol. 10 No. 1, 1978, pp. 27-52. 

29. Guttag, J.V., Horowitz, E., Musser, D.R. Abstract Data Types and 
Software Validation. Comm. ACM Vol. 21 No. 12, Dec., 1978, pp. 1048-1064. 

30. Guttag, J.V. Personal Communication. 

31. Guttag, J.V. Notes on Type Abstraction. IEEE Trans. on Software 
Engineering Vol. SE-6 No. 1, Jan., 1980, pp. 13-23. 

32. Guttag, J.V., Horning, J.J. Formal Specification as a Design Tool. 
Proceedings of the Seventh ACM Symposium on Principles of Programming 
Languages, Las Vegas, Nevada, Jan., 1980. 

33. Guttag, J.V. Personal Communication, Jan., 1980. 

34. Harel, D., Pratt, V.R. Comments on Program Verification. In Research 
Directions in Software Technology (ed. Wegner, P.), M.I.T. Press, Cambridge, 
MA, 1979, pp. 387-391. 



35. Hewitt, C. Personal Communication. Lab. for Computer Science, MIT, 
Dec., 1978. 

36. Hoare, C.A.R. Procedures and Parameters: An Axiomatic Approach. In 
Symposium on Semantics of Algorithmic Languages, (ed. Engeler, E.) as Lecture 
Notes in Mathematics, No. 188, Springer Verlag, 1971, pp. 102-115. 

37. Hoare, C.A.R. Proof of Correctness of Data Representations. Acta 
Informatica Vol. 1, No. 4, 1972, pp. 271-281. 

38. Hoare, C.A.R. Notes on Data Structuring. In Structured Programming, 
(Dahl, O.-J., Dijkstra, E.W., Hoare, C.A.R.), Academic Press, London and New 
York, 1972, pp. 83-174. 

39. Hoare, C.A.R. Recursive Data Structures. Intl. Journal of Computer and 
Information Sciences Vol. 4 No. 2, June, 1975, pp. 105-132. 

40. Kapur, D. Proving Correctness of Implementation of a Data Abstraction 
Using the Algebraic Method. Unpublished Handout, M.I.T. Course 6.891 
Specification Techniques, Nov., 1975. 

41. Kapur, D. Specifications of Majster's Traversable Stack and Veloso's 
Traversable Stack. SIGPLAN Notices Vol. 14 No. 5, May, 1979, pp. 46-53. 

42. Kapur, D., Srivas, M.K. Expressiveness of the Operation Set of A Data 
Abstraction. Proceedings of the Seventh ACM Symposium on Principles of 
Programming Languages, Las Vegas, Nevada, Jan., 1980. An expanded version 
appeared as Computation Structures Group Memo 179-1, Lab. for Computer 
Science, MIT, Jan., 1980. 

43. Kapur, D. The Expressive Power of Algebraic Languages for Specifying 
Abstract Data Types. Draft Manuscript, Lab. for Computer Science, MIT, 
June, 1979. 

44. Knuth, D.E., Bendix, P.B. Simple Word Problems in Universal Algebra. 
In Computational Problems in Abstract Algebra (ed. Leech, J.), Pergamon Press, 
1970, pp. 263-297. 

45. Lampson, B.W., Horning, J.J., London, R.L., Mitchell, J.G., Popek, G.L. 
Report on the Programming Language Euclid. SIGPLAN Notices Vol. 12 
No. 2, Feb., 1977. 



46. Levin, R. Program Structures for Exceptional Condition Handling. Ph.D. 
Thesis, Dept. of Computer Science, Carnegie-Mellon University, June, 1977. 

47. Liskov, B.H., Zilles, S.N. Specification Techniques for Data Abstractions. 
IEEE Trans. on Software Engg. Vol. SE-1 No. 1, 1975, pp. 7-19. 

48. Liskov, B.H., Berzins, V. An Appraisal of Program Specifications. 
Computation Structures Group Memo 141-1, Lab. for Computer Science, MIT, 
Jan., 1977. Also in Research Directions in Software Technology (ed. Wegner, 
P.), M.I.T. Press, Cambridge, MA, 1979, pp. 276-301. 

49. Liskov, B.H., Snyder, A., Atkinson, R., Schaffert, C. Abstraction 
Mechanisms in CLU. Comm. ACM Vol. 20 No. 8, Aug., 1977, pp. 564-576. 

50. Liskov, B.H., Snyder, A.S. Exception Handling in CLU. IEEE Trans. on 
Software Engg. Vol. SE-5 No. 6, Nov., 1979, pp. 546-557. 

51. Liskov, B.H. Modular Program Construction Using Abstraction. 
Computation Structures Group Memo 184, Lab. for Computer Science, MIT, 
Sept., 1979. 

52. Liskov, B.H. et al. CLU Reference Manual. MIT-LCS-TR-225, Lab. for 
Computer Science, MIT, Oct., 1979. 

53. Majster, M.E. Limits of the Algebraic Specification of Abstract Data 
Types. SIGPLAN Notices Vol. 12 No. 10, Oct., 1977, pp. 37-42. 

54. Manna, Z. Mathematical Theory of Computation. McGraw Hill, 
Computer Science Series, 1974. 

55. Manna, Z. Six Lectures on the Logic of Computer Programming. Stanford 
A.I. Laboratory AIM-318, Nov., 1978. 

56. McCarthy, J. Towards a Mathematical Science of Computation. 
Proceedings IFIP Congress, 1962, pp. 27-28. 

57. McCarthy, J. A Basis for a Mathematical Theory of Computation. In 
Computer Programming and Formal Systems (eds. Braffort and Hirschberg), 
North Holland Publishing Co., Amsterdam-London, 1963, pp. 33-70. 



58. Minsky, M. Computation: Finite and Infinite Machines. Prentice Hall, 
Englewood Cliffs, NJ, 1967. 

59. Morris, J.H., Jr. Types Are Not Sets. Proceedings of the First ACM 
Symposium on Principles of Programming Languages, Boston, Oct., 1973, 
pp. 120-124. 

60. Musser, D.R. Abstract Data Types in the AFFIRM System. IEEE Trans. 
on Software Engg. Vol. SE-6 No. 1, Jan., 1980, pp. 24-32. 

61. Musser, D.R. Proving Inductive Properties of Abstract Data Types. 
Proceedings of the Seventh ACM Symposium on Principles of Programming 
Languages, Las Vegas, Nevada, Jan., 1980. 

62. Nakajima, R., Nakahira, H., Honda, M. Hierarchical Program 
Specification and Verification - A Many Sorted Logical Approach. Preprint 
RIMS 265, Nov., 1978. 

63. Nourani, F. Constructive Extension and Implementation of Abstract Data 
Types and Algorithms. Ph.D. Thesis, Dept. of Computer Science, University of 
California, Los Angeles, June, 1979. 

64. Okrent, H.F. Synthesis of Data Structures from Algebraic Descriptions. 
Ph.D. Thesis, Dept. of E.E. & C.S., MIT, Feb., 1977. 

65. Palme, J. Protected Program Modules in Simula 67. National Defense 
Research Institute, Stockholm, Sweden, July, 1973. 

66. Parnas, D.L. Information Distribution Aspects of Design Methodology. 
Information Processing 71, Vol. I, North Holland, Amsterdam, 1972, 
pp.339-344. 

67. Polajnar, J. An Algebraic View of Protection and Extendibility in Abstract 
Data Types. Ph.D. Thesis, Dept. of Computer Science, University of Southern 
California, Sept., 1978. 

68. Srivas, M.K. Preliminary Investigations of a Thesis Topic on Automatic 
Synthesis of Abstract Data Types. Unpublished Manuscript, Lab. for 
Computer Science, MIT, Dec., 1978. 



69. Standish, T.A. Data Structures - An Axiomatic Approach. Bolt, Beranek, 
and Newman, Inc., Technical Report 2639, Aug., 1973. 

70. Subrahmanyam, P. On a Finite Axiomatization of the Data Type L. 
SIGPLAN Notices Vol. 13 No. 4, April, 1978, pp. 80-84. 

71. Thatcher, J.W., Wagner, E.G., Wright, J.B. Data Type Specification: 
Parameterization and the Power of Specification Techniques. Proceedings of 
the Tenth SIGACT Conference, May, 1978. Also an IBM Report RC7757, 
July, 1979. 

72. Wegbreit, B., and Spitzen, J.M. Proving Properties of Complex Data 
Structures. JACM Vol. 23 No. 2, April, 1976, pp. 389-396. 

73. Wirth, N. Program Development by Stepwise Refinement. Comm. ACM 
Vol. 14 No. 4, April, 1971, pp. 221-227. 

74. Wulf, W., London, R.L., and Shaw, M. Abstraction and Verification in 
ALPHARD: Introduction to Language and Methodology. USC Information 
Sciences Institute Research Report, 1976. 

75. Wulf, W., London, R.L., and Shaw, M. An Introduction to the 
Construction and Verification of Alphard Programs. IEEE Trans. on Software 
Engg. Vol. SE-2 No. 4, Dec., 1976, pp. 253-265. 

76. Zilles, S.N. Algebraic Specification of Data Types. Project MAC Progress 
Report, 1974, pp. 52-58. Also Computation Structures Group Memo 119, Lab. 
for Computer Science, MIT, 1974. 

77. Zilles, S.N. An Introduction to Data Algebra. Draft Working Paper, IBM 
San Jose Research Lab, Sept., 1975. 



Appendix I - Elaboration of Scope and Assumptions 

In this appendix, we elaborate on the scope of the thesis and the assumptions 
made about abstract data types and their operations. 

1. Immutable and Mutable Data Types 

We adopt the commonly accepted informal view of a data type as a collection of 
objects with a finite collection of operations to manipulate these objects. The objects by 
themselves are not meaningful and the operations are the only way to construct, 
manipulate and observe the objects as well as to extract information stored in them. 

Data types can be classified based on the behavior of their objects. An object of a data 
type may or may not exhibit time varying behavior. An object exhibiting time varying 
behavior is called a mutable object, whereas an object whose behavior does not change is 
called an immutable object [49]. We also call an immutable object a value. A data type 
having only immutable objects is called an immutable data type; otherwise, a data type is 
called a mutable data type. A mutable data type may also have immutable objects, but at 
least one of its objects must be mutable. A mutable object can be factored into two 
components: (i) identity, and (ii) state [41]. A mutable data type has at least one operation 
constructing new objects. Its operations may change the state of a mutable object without 
affecting the object identity. At a given point in a computation, there can exist many 
different mutable objects having the same state. For a wide class of mutable data types, the 
state component of the mutable objects can be described as an immutable data type. 

In this thesis, we have considered only immutable data types with a finite set of 
computable operations. We have not considered immutable data types with iterators [49] 
nor data types involving streams and lazy evaluation [18]. 
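
The distinction between immutable values and mutable objects can be illustrated with a minimal Python sketch. The names `IntSet` and `Cell` are hypothetical and do not come from the thesis; the sketch only assumes that the state of a mutable object can be described by an immutable value, as stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntSet:
    """Immutable value (hypothetical example type): operations return new values."""
    elems: frozenset = frozenset()

    def insert(self, i: int) -> "IntSet":
        return IntSet(self.elems | {i})


class Cell:
    """Mutable object: identity is the Python object, state is an IntSet value."""

    def __init__(self) -> None:
        self.state = IntSet()

    def insert(self, i: int) -> None:
        # The state changes; the identity of the object does not.
        self.state = self.state.insert(i)


s0 = IntSet()
s1 = s0.insert(3)      # s0 itself is unchanged
a, b = Cell(), Cell()
a.insert(3)
b.insert(3)            # two distinct objects that now have the same state
```

Here `a` and `b` have equal states but different identities, whereas `s0` and `s1` are simply two different values.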



2. Exceptional Behavior 

During the design and construction of reliable software, there is often a need to 
have data types with operations exhibiting exceptional behavior. (See [24, 46, 52, 50] for a 
discussion on the need for an exception handling mechanism in a programming language.) 
It is only meaningful to apply such operations on a subset of their domains. If an input 
falls outside the subset, such operations notify their callers that the input is not 
'good' by signalling exceptions. An exception is assumed to have two components: a 
descriptive name and a possible set of arguments which carry information from the point 
where the exception is signalled to its handlers. 

We assume that every operation of a data type terminates on every input in its 
domain: it either terminates normally by returning a value of its range type or terminates 
by signalling an exception. We think it is not a good practice to design data types having 
operations that do not terminate on some inputs. If a partial function on the values of a 
data type needs to be realized, it can be programmed in terms of the operations of the data 
type in a host programming language supporting the data type mechanism. 

The assumption of the operations being total simplifies the formalism developed 
in the thesis. Our formalism can be extended to partial operations without much difficulty 
by introducing a special value 'undefined' for every data type such that if a partial 
operation is not defined on an input, then it returns 'undefined' on that input. 
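
As a minimal sketch of this view of exceptional behavior, the Python fragment below models the outcome of a total operation as either a normal value or a signal carrying a name and arguments. The types `Normal` and `Signal`, the operation `fetch`, and the exception name `bounds` are all hypothetical illustrations, not notation from the thesis.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Normal:
    """Normal termination with a value of the range type."""
    value: object


@dataclass(frozen=True)
class Signal:
    """Exceptional termination: a descriptive name plus arguments for the handler."""
    name: str
    args: Tuple = ()


Outcome = Union[Normal, Signal]


def fetch(seq: tuple, i: int) -> Outcome:
    """Total operation: every input yields either a Normal or a Signal outcome."""
    if 0 <= i < len(seq):
        return Normal(seq[i])
    # The arguments carry information from the signalling point to the handler.
    return Signal("bounds", (i, len(seq)))
```

A caller can inspect the outcome, for instance `fetch((10, 20), 5)` yields a `Signal` named `bounds` with arguments `(5, 2)`.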

3. Nondeterminism 

There are data types some of whose operations exhibit nondeterministic behavior. 
These operations return one of many possible values for a given input. For example, the 
Choose operation of the data type finite set of integers, which returns any element of a 
given nonempty set, is nondeterministic. Similarly, the Index operation of the data type 
finite sequence of elements, which returns a position of a given element in a given sequence, 
is also nondeterministic because the sequence can have more than one occurrence of the 
same element. All prior work on data types has assumed the operations to be deterministic. 
We feel that a formalism for data types must be capable of handling data types with 
nondeterministic operations, as nondeterminism is a powerful and elegant abstraction 
mechanism for designing programs [13, 9]. Furthermore, allowing nondeterministic 
operations permits the handling of data types with operations implemented in a parallel 
environment. 

We assume that a nondeterministic operation has only finitely many choices on a 
particular input. We rule out data types having operations with infinitely many choices. 
Such an operation can be used to write programs having unbounded nondeterminism [13]. 
There is a controversy about the realizability of programming constructs having 
unbounded nondeterminism and about the limitation of the expressive power of a language 
that rules out programs with unbounded nondeterminism [35]. Using our formalism, it is 
possible to define a data type whose values are 'infinite' (e.g., 'infinite' sets, 'infinite' 
sequences, etc.) insofar as these values can be finitely constructed using the operations; 
but nondeterministic operations on these values that have infinitely many choices are ruled 
out. Our formalism would however extend without much difficulty to the case where the 
constraint that a nondeterministic operation has only finitely many choices on an input is 
dropped. 

We also assume that if a nondeterministic operation signals an exception on an 
input, then the operation behaves deterministically on the input, i.e., a nondeterministic 
operation is not allowed to have a choice between signalling and terminating normally on 
any particular input. This assumption leads to a simpler and modular characterization of 
the observable behavior of the data type than would otherwise be possible. 
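
The assumptions of this section can be sketched in Python by modelling a nondeterministic operation as a function that returns the finite set of its possible outcomes. This encoding, the outcome tags, and the exception names `empty_set` and `not_found` are assumptions made for illustration only.

```python
def choose(s: frozenset) -> frozenset:
    """All possible outcomes of Choose on s: a finite choice set."""
    if not s:
        return frozenset({("signal", "empty_set")})   # hypothetical exception name
    return frozenset(("normal", x) for x in s)


def index(seq: tuple, x) -> frozenset:
    """All possible outcomes of Index: any position at which x occurs in seq."""
    hits = [i for i, y in enumerate(seq) if y == x]
    if not hits:
        return frozenset({("signal", "not_found")})   # hypothetical exception name
    return frozenset(("normal", i) for i in hits)


def respects_signal_assumption(outcomes: frozenset) -> bool:
    """True if the choice set is all normal values, or exactly one signal."""
    kinds = {kind for kind, _ in outcomes}
    return kinds == {"normal"} or (kinds == {"signal"} and len(outcomes) == 1)
```

For example, `index((5, 7, 5), 5)` has two possible normal outcomes, positions 0 and 2, while a choice set mixing a signal with a normal value is rejected by `respects_signal_assumption`.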



Appendix II - Definitions of Algebraic Concepts and Proofs of 
Theorems in Chapter 2 

In the first section, we extend the definitions of congruence, homomorphism, and 
isomorphism to extended heterogeneous algebras having nondeterministic functions. In 
the second section, we present the proof of Theorem 2.2. In the third section, we explain 
how Definition 2.12 of behavioral equivalence on type algebras captures the desired 
property that a computation (i.e., an interpretation of a ground term) results in equivalent 
values in two behaviorally equivalent type algebras. 

1. Congruence, Homomorphism, and Isomorphism 
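Def. A2.1 A congruence R on a conventional heterogeneous algebra 

\[ A = [\, \{ V_{D'} \mid D' \in \Delta' \} ;\ \{ f_\sigma \mid \sigma \in \Omega \} \,], \]

in which each $f_\sigma$ is a total deterministic function, is a family of equivalence relations 
$\{ R_{D'} \mid D' \in \Delta' \}$ such that 

for every $\sigma \in \Omega$, $\sigma : D_1 \times \dots \times D_n \to D'$, 

for all $v_1, v_1' \in V_{D_1}, \dots, v_n, v_n' \in V_{D_n}$, 

\[ (*) \quad (v_1 \, R_{D_1} \, v_1' \wedge \dots \wedge v_n \, R_{D_n} \, v_n') \Rightarrow f_\sigma(v_1, \dots, v_n) \; R_{D'} \; f_\sigma(v_1', \dots, v_n'). \]

We also say that R has the substitution property. 

In an extended heterogeneous algebra having nondeterministic functions, when $f_\sigma$ 
is a nondeterministic total function, then (*) is modified to 

\[ (v_1 \, R_{D_1} \, v_1' \wedge \dots \wedge v_n \, R_{D_n} \, v_n') \Rightarrow \big( \forall v^1 \in \{ f_\sigma(v_1, \dots, v_n) \} \; \exists v^2 \in \{ f_\sigma(v_1', \dots, v_n') \} \; [\, v^1 \, R_{D'} \, v^2 \,] \]
\[ \qquad \wedge \; \forall v^2 \in \{ f_\sigma(v_1', \dots, v_n') \} \; \exists v^1 \in \{ f_\sigma(v_1, \dots, v_n) \} \; [\, v^1 \, R_{D'} \, v^2 \,] \big). \]

If $R_{D'}$ is the identity relation (equality), then the above reduces to 

\[ (v_1 \, R_{D_1} \, v_1' \wedge \dots \wedge v_n \, R_{D_n} \, v_n') \Rightarrow \{ f_\sigma(v_1, \dots, v_n) \} = \{ f_\sigma(v_1', \dots, v_n') \}. \]

Congruences on an extended heterogeneous algebra A can also be partially 
ordered in the same way as in the case of a conventional heterogeneous algebra: 

Given two congruences $E^1$ and $E^2$, $E^2$ is larger than $E^1$, expressed as $E^1 \le E^2$, if and only if 
for each $D' \in \Delta'$, $E^1_{D'} \subseteq E^2_{D'}$. 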



Congruences form a lattice with respect to $\le$, and have the least element (the identity 
congruence) and the greatest element (the universal congruence). 
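
The substitution property for nondeterministic functions can be checked mechanically on a small finite algebra. The Python sketch below assumes a single-sorted algebra for brevity, represents a relation as a set of pairs, and represents a nondeterministic function as a function returning its choice set; all names are hypothetical.

```python
from itertools import product


def is_congruence(values, rel, f, arity):
    """Check the substitution property of rel for one nondeterministic function f.

    rel is an equivalence relation given as a set of pairs; f maps an
    argument tuple to the finite set of its possible results.
    """
    def related_sets(xs, ys):
        # Every choice on one side must be related to some choice on the other.
        return (all(any((x, y) in rel for y in ys) for x in xs) and
                all(any((x, y) in rel for x in xs) for y in ys))

    for args in product(values, repeat=arity):
        for args2 in product(values, repeat=arity):
            if all((a, b) in rel for a, b in zip(args, args2)):
                if not related_sets(f(args), f(args2)):
                    return False
    return True


values = range(4)
parity = {(a, b) for a in values for b in values if a % 2 == b % 2}
blocks = {(a, b) for a in values for b in values if a // 2 == b // 2}


def step(args):
    """Nondeterministic: move one position up or down, modulo 4."""
    return {(args[0] + 1) % 4, (args[0] + 3) % 4}


def succ(args):
    """Deterministic successor modulo 4, as a singleton choice set."""
    return {(args[0] + 1) % 4}
```

In this example `parity` has the substitution property for `step`, while `blocks` does not have it for `succ`, since 0 and 1 are related but their successors 1 and 2 are not.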

Def. A2.2 Let $A_1$ and $A_2$ be 

\[ A_i = [\, \{ V^i_{D'} \mid D' \in \Delta' \} ;\ \{ f^i_\sigma \mid \sigma \in \Omega \} \,], \quad i = 1, 2. \]

A family of total (deterministic) functions $\Phi = \{ \Phi_{D'} : V^1_{D'} \to V^2_{D'} \mid D' \in \Delta' \}$ is called a 
homomorphism from $A_1$ to $A_2$ if 
for each $\sigma : D_1 \times \dots \times D_n \to D'$, 
for each $v_1$ of type $D_1$ (i.e., $v_1 \in V^1_{D_1}$), ..., $v_n$ of type $D_n$, 
(i) if $f^1_\sigma$ is deterministic, then $f^2_\sigma$ is also deterministic and 

\[ \Phi_{D'}(f^1_\sigma(v_1, \dots, v_n)) = f^2_\sigma(\Phi_{D_1}(v_1), \dots, \Phi_{D_n}(v_n)), \]

(ii) if $f^1_\sigma$ is nondeterministic, then $f^2_\sigma$ is either nondeterministic or deterministic, and 

\[ \Phi_{D'}(\{ f^1_\sigma(v_1, \dots, v_n) \}) = \{ f^2_\sigma(\Phi_{D_1}(v_1), \dots, \Phi_{D_n}(v_n)) \}. \]

(Case (ii) above covers case (i) also.) We call $\Phi$ an onto homomorphism from $A_1$ to $A_2$ if 
every function in $\Phi$ is onto; in that case, $A_2$ is called a homomorphic image of $A_1$. If every 
function in $\Phi$ is a bijection, then $\Phi$ is an isomorphism from $A_1$ to $A_2$, and $A_1$ and $A_2$ are 
isomorphic. Note that, if $A_1$ and $A_2$ are isomorphic nondeterministic algebras, then they 
have the same amount of nondeterminism, which is not necessarily the case if $A_2$ is a 
homomorphic image of $A_1$. 

It can be shown that the results from conventional heterogeneous algebras in [4] 
extend to the extended heterogeneous algebras. In particular, we can show that 

Prop. A2.1 If R is a congruence on an extended heterogeneous algebra A, then there exists 
an onto homomorphism from A to A/R. ■ 

Prop. A2.2 If $\Phi$ is an onto homomorphism from $A_1$ to $A_2$, then the kernel R of $\Phi$ on $A_1$, 
where $R = \{ R_{D'} \mid D' \in \Delta' \}$ and $R_{D'} = \{ \langle v, v' \rangle \mid \Phi_{D'}(v) = \Phi_{D'}(v') \}$, is a congruence on $A_1$. 
■ 
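
As an illustration of Prop. A2.2, the following Python sketch computes the kernel of an onto homomorphism on a small single-sorted deterministic algebra and lists the classes of the corresponding quotient. The algebra (integers modulo 6 with a successor operation) and all names are hypothetical choices made for the example.

```python
def kernel(values, phi):
    """Pairs <v, v'> that phi maps to the same value."""
    return {(v, w) for v in values for w in values if phi(v) == phi(w)}


def quotient_classes(values, rel):
    """Equivalence classes of rel, i.e., the carrier of the quotient algebra."""
    return {frozenset(w for w in values if (v, w) in rel) for v in values}


values = range(6)


def succ(v):
    """The single operation of the source algebra: successor modulo 6."""
    return (v + 1) % 6


def phi(v):
    """Onto homomorphism to the integers modulo 3 with successor modulo 3."""
    return v % 3


R = kernel(values, phi)
# Substitution property: related arguments give related results.
respects_succ = all((succ(v), succ(w)) in R for v, w in R)
```

The quotient has the three classes {0, 3}, {1, 4} and {2, 5}, which correspond one to one with the values of the image algebra.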



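The following diagram, in which $\Phi$ is an onto homomorphism from $A_1$ to $A_2$, R is 
the kernel of $\Phi$ on $A_1$, H is the homomorphism induced by R from $A_1$ to $A_1/R$, and $\Phi'$ is an 
isomorphism from $A_1/R$ to $A_2$, commutes, i.e., $\Phi = \Phi' \circ H$. 

[Diagram: $\Phi$ from $A_1$ to $A_2$; H from $A_1$ to $A_1/R$; $\Phi'$ from $A_1/R$ to $A_2$.] 
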
2. Proof of Theorem 2.2 

Thm. 2.2 Assuming that $E_{Bool}$ is the largest congruence on a model of Bool, E is the 
largest congruence on A. 

Proof By induction on type algebras. 

Basis: $\Delta = \emptyset$, the null set. 

(i) Bool - the statement holds because of the assumption. 

(ii) D other than Bool - since every value in $V_D$ is observably equivalent to every other 
value, the statement is true. 

Inductive Step: $\Delta \neq \emptyset$. 

Assume that the statement holds for each $D' \in \Delta$. 

To prove the statement for D, we must show that $E_D$ is the largest equivalence relation 
such that E is a congruence on A. We prove this by contradiction. 

Suppose $E_D$ is not the largest equivalence relation and $E'_D$ is a larger equivalence 
relation containing $E_D$ such that $E' = \{ E_{D'} \mid D' \in \Delta \} \cup \{ E'_D \}$ is a congruence on A. 
There exists $\langle v, v' \rangle \in E'_D$ such that $\langle v, v' \rangle \notin E_D$. So, there is a c(x) of type $D' \in \Delta$ such that 
there is an interpretation of c[x/v] in A distinguishable from every interpretation of c[x/v'] 
in A, or vice versa. But this is contradictory to E' being a congruence, which requires that 
for every interpretation $v_1$ of c[x/v] in A, there is an interpretation $v_1'$ of c[x/v'] in A such 
that $\langle v_1, v_1' \rangle \in E_{D'}$, and vice versa. So, $E_D$ is the largest equivalence relation. ■ 



Modification for type algebras having an exception domain 

The proof has the same structure as above, except that we also have to consider 
the case when $\langle v, v' \rangle \notin E_D$ implies that v and v' are distinguishable because a computation 
c(x) (i) signals on v and returns a normal value on v', or vice versa, or (ii) signals 
distinguishable exceptional values on v and v'. In the basis step, for the case of D other 
than Bool, $E_D$ need not be the universal relation on $V_D$. 

3. Elaboration of the Definition of Behavioral Equivalence and 
Proofs of Theorems 2.5 and 2.6 

In Section 2.2, we defined two type algebras to be behaviorally equivalent if their 
reduced algebras are isomorphically equivalent. We further elaborate on this definition. 
We prove Theorems 2.5 and 2.6 of Section 2.2. The discussion and theorems of this section 
extend to modified type algebras having the exception domain. The set of mappings from 
a modified type algebra A to another modified type algebra A' includes a mapping from 
the exception domain of A to the exception domain of A' which gets defined by the 
mappings on the normal domain. 

As is discussed in Subsection 2.2.5, the behavioral equivalence of type algebras $A_1$ 
and $A_2$ can be expressed as follows: 

[Diagram: $\Psi$ from $A_1$ to $A_2$; $H_1$ from $A_1$ to $A_1/E_1$; $H_2$ from $A_2$ to $A_2/E_2$; $\Phi$ from $A_1/E_1$ to $A_2/E_2$.] 

The above diagram commutes, i.e., 

\[ \Phi \circ H_1 = H_2 \circ \Psi, \qquad (\dagger) \]

where $A_1/E_1$ and $A_2/E_2$ are the reduced algebras corresponding to $A_1$ and $A_2$ respectively 
and $\Phi$ is the isomorphism defined by the isomorphic equivalence of $A_1/E_1$ and $A_2/E_2$. The 
equation ($\dagger$) above defines the set $\Psi$ of many-to-many mappings, where 
$\Psi = \{ \Psi_{D'} : V^1_{D'} \to V^2_{D'} \mid D' \in \Delta \cup \{D\} \}$. 

We first discuss how, for two isomorphically equivalent algebras $A_1$ and $A_2$, the 
bijection $\Phi_D$ in an isomorphism $\Phi$ can be constructed, and show that the interpretations of a 
ground term e in $A_1$ and $A_2$ are 'equivalent.' Later, we discuss these properties for 
behaviorally equivalent algebras. 

3.1 Isomorphically Equivalent Type Algebras 

For the case when the deterministic constructors of a data type D can generate all 
the values of D, we have 

Thm. A2.1 If $A_1$ and $A_2$ are isomorphically equivalent, then $\{ \Phi_{D'} \mid D' \in \Delta \}$ uniquely 
determines the bijection $\Phi_D$. 

Proof By definition of isomorphic equivalence, there exists a bijection $\Phi_D : V^1_D \to V^2_D$ 
such that $\Phi = \{ \Phi_{D'} \mid D' \in \Delta' \}$ is an isomorphism. We prove the statement by 
contradiction. Let us assume that $\Phi_D$ is not unique, i.e., there are two bijections $\Phi^1_D$ 
and $\Phi^2_D$ such that $\Phi^1 = \{ \Phi_{D'} \mid D' \in \Delta \} \cup \{ \Phi^1_D \}$ and $\Phi^2 = \{ \Phi_{D'} \mid D' \in \Delta \} \cup \{ \Phi^2_D \}$ are 
isomorphisms. 

Since $\Phi^1_D$ and $\Phi^2_D$ are different, there exists $v \in V^1_D$ with $\Phi^1_D(v) \neq \Phi^2_D(v)$. We pick a v 
that can be constructed by the minimum number (say, k) of applications of the 
deterministic constructors and on which $\Phi^1_D$ and $\Phi^2_D$ differ. We have $v = f^1_\sigma(v_1, \dots, v_n)$ for 
some $\sigma$, and if $D_i = D$, $v_i$ can be constructed by $k_i < k$ applications of 
constructors; thus, $\Phi^1_D(v_i) = \Phi^2_D(v_i)$. 
By the definition of isomorphic equivalence, 

\[ \Phi^1_D(v) = f^2_\sigma(\Phi^1_{D_1}(v_1), \dots, \Phi^1_{D_n}(v_n)) = f^2_\sigma(\Phi^2_{D_1}(v_1), \dots, \Phi^2_{D_n}(v_n)) = \Phi^2_D(v), \]

meaning that $\Phi^1_D(v) = \Phi^2_D(v)$, which is a contradiction. 

So, there is no v such that $\Phi^1_D(v) \neq \Phi^2_D(v)$. 
Hence the proof of the theorem. ■ 



We can construct the bijection $\Phi_D$ as follows: 
For every constructor $\sigma : D_1 \times \dots \times D_n \to D$, 

\[ (\Phi_{D_1}(v_1) = v_1' \wedge \dots \wedge \Phi_{D_n}(v_n) = v_n') \Rightarrow \Phi_D(f^1_\sigma(v_1, \dots, v_n)) = f^2_\sigma(v_1', \dots, v_n'). \]

The case of $\sigma$'s not taking any argument of type D serves as the basis step in the 
construction of $\Phi_D$. 
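
The inductive construction above can be sketched in Python for two hypothetical representations of finite sets of integers with deterministic constructors Null and Insert: sorted tuples in the first algebra and frozensets in the second. The mapping on the defining type Int is assumed to be the identity, and the construction is cut off after a fixed number of rounds so that it terminates.

```python
def build_bijection(null1, insert1, null2, insert2, ints, rounds):
    """Build the mapping on D by induction on constructor applications."""
    phi = {null1: null2}            # basis: constructor with no argument of type D
    for _ in range(rounds):
        for v, w in list(phi.items()):
            for i in ints:          # the mapping on Int is the identity
                phi[insert1(v, i)] = insert2(w, i)
    return phi


def insert_sorted(t, i):
    """Insert in the first algebra: sets represented as sorted tuples."""
    return t if i in t else tuple(sorted(t + (i,)))


def insert_frozen(s, i):
    """Insert in the second algebra: sets represented as frozensets."""
    return s | {i}


phi = build_bijection((), insert_sorted, frozenset(), insert_frozen,
                      ints=(1, 2), rounds=2)
```

With two integers and two rounds, `phi` relates the four tuple values to the four frozenset values, for instance `(1, 2)` to `frozenset({1, 2})`.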

The above theorem holds in case $A_1$ and $A_2$ are reduced even if some of the 
values of D cannot be constructed without using a nondeterministic constructor. However, 
it does not hold in general; for example, consider a variation of the type algebra $A_S$ for 
Set-Int, denoted by $A_{S1}$, having everything else the same as $A_S$ except that in $A_{S1}$, the 
interpretation of the operation Insert is a nondeterministic function, which appends the 
integer being inserted to the beginning of the sequence representing the given set or at the 
end of the sequence: 

\[ \mathrm{Insert}(\langle i_1, \dots, i_m \rangle, k) = \begin{cases} \langle i_1, \dots, i_m \rangle & \text{if } \exists\, 1 \le j \le m,\ i_j = k, \\ \langle k, i_1, \dots, i_m \rangle \text{ or } \langle i_1, \dots, i_m, k \rangle & \text{otherwise.} \end{cases} \]

$A_{S1}$ is clearly isomorphically equivalent to itself and there is more than one isomorphism 
from $A_{S1}$ to itself. 



Thm. A2.2 Given two isomorphically equivalent type algebras $A_1$ and $A_2$ defining an 
isomorphism $\Phi$, a value v of type D in $A_1$ has the same observable behavior in $A_1$ as $\Phi_D(v)$ 
in $A_2$ in the sense that for every term c(x) of type $D'' \in (D)^+$ with one free variable x of type 
D, 

\[ \Phi_{D''}(\{ c[x/v]|_{A_1} \}) = \{ c[x/\Phi_D(v)]|_{A_2} \}. \]

Proof By induction on the depth of x in c(x). 

depth(x) = 0. 

depth($\sigma(e_1, \dots, e_n)$) = max(depth($e_{i_1}$), ..., depth($e_{i_j}$)) + 1, 
where each $e_{i_k}$ has x as a variable. 

Basis depth(c(x)) = 0. 

So, c(x) is x, and the statement of the theorem trivially holds. 
Inductive Step Assume the statement of the theorem for the case when 
depth(c(x)) < k, k > 0, to show for the case when depth(c(x)) = k. Let 
$c(x) = \sigma(e_1, \dots, e_n)$, 
where $e_i$ is of type $D_i$. We assume that the statement holds for each $e_i$, so 

\[ \Phi_{D_i}(\{ e_i[x/v]|_{A_1} \}) = \{ e_i[x/\Phi_D(v)]|_{A_2} \}. \]

Then 

\[ \Phi_{D''}(\{ c[x/v]|_{A_1} \}) = \Phi_{D''}(\{ f^1_\sigma(\{ e_1[x/v]|_{A_1} \}, \dots, \{ e_n[x/v]|_{A_1} \}) \}) \]
\[ = \{ f^2_\sigma(\Phi_{D_1}(\{ e_1[x/v]|_{A_1} \}), \dots, \Phi_{D_n}(\{ e_n[x/v]|_{A_1} \})) \} \quad \text{(since $\Phi$ is an isomorphism)} \]
\[ = \{ \sigma(e_1, \dots, e_n)[x/\Phi_D(v)]|_{A_2} \} = \{ c[x/\Phi_D(v)]|_{A_2} \}. \quad ■ \]

For the case of modified type algebras, we are interested in terms such that $c[x/v]|_{A_1}$ 
and $c[x/\Phi_D(v)]|_{A_2}$ are not undefined. 
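
The depth measure used in the induction above can be written as a short recursive function. In this Python sketch a term is assumed to be either the variable `"x"`, a constant that is not a tuple, or a tuple whose first component is an operation symbol; the encoding and the operation names in the example are hypothetical.

```python
def has_x(t):
    """True if the free variable x occurs in term t."""
    return t == "x" or (isinstance(t, tuple) and any(has_x(a) for a in t[1:]))


def depth(t):
    """Depth of x in t; t is assumed to contain x."""
    if t == "x":
        return 0
    # Only the arguments that contain x contribute to the depth.
    return 1 + max(depth(a) for a in t[1:] if has_x(a))


example = ("Has", ("Insert", "x", ("Zero",)), ("Zero",))
```

For `example`, the variable sits under two operation symbols, so its depth is 2; arguments such as `("Zero",)` that do not mention x are ignored.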

3.2 Behaviorally Equivalent Type Algebras 

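Thm. A2.3 If $A_1$ and $A_2$ are behaviorally equivalent, 
then $\langle v, v' \rangle \in \Psi_D \Rightarrow \langle H_1(v), H_2(v') \rangle \in \Phi_D$. 

Proof Obvious from the diagram. Since $\Phi \circ H_1 = H_2 \circ \Psi$, from $\langle v, v' \rangle \in \Psi_D$, we get 

\[ \Phi_D(H_1(v)) = H_2(v'). \quad ■ \]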

We now present the proofs of Theorems 2.5 and 2.6 of Subsection 2.2.5. 

Thm. 2.5 For behaviorally equivalent $A_1$ and $A_2$, for every ground term e of type 
$D'' \in (D)^+$, for every $v \in \{ e|_{A_1} \}$, there is a $v' \in \{ e|_{A_2} \}$ such that $\langle [v], [v'] \rangle \in \Phi_{D''}$, and 
vice versa. 

Proof By induction on the structure of type algebras. 

1. Basis $\Delta = \emptyset$ 

(i) D is Bool: Since all behaviorally equivalent algebras are isomorphic and the 
observable equivalence relation is the identity relation, the above is true. 

(ii) D is other than Bool: Since the observable equivalence relation is the universal 
relation, the above is true. 

1. Inductive Step $\Delta \neq \emptyset$ 
Assume that the above statement holds for all ground terms of type $D'' \in (D)^+$ not 
having any operation symbol in $\Omega$. (1) 

To show for a ground term e by induction on the number of operation symbols from $\Omega$ in e. 

2. Basis The basis step holds because of the assumption. 

2. Inductive Step Assume for e having k' < k occurrences of operation symbols from $\Omega$, 
to show for e having k occurrences. (2) 

This is also proved by induction on the depth of the outermost operation symbol 
from $\Omega$ in e. 

depth($\sigma(e_1, \dots, e_n)$) = 0 if $\sigma \in \Omega$, and 

depth($\sigma(e_1, \dots, e_n)$) = min(depth($e_1$), ..., depth($e_n$)) + 1 if $\sigma \notin \Omega$. 

3. Basis depth(e) = 0, i.e., $e = \sigma(e_1, \dots, e_n)$, and $\sigma \in \Omega$. 

So, an $e_i$ can have at most k-1 occurrences of operations from $\Omega$. 

We prove the statement of the theorem in one direction; the proof in the other 
direction is the same except that v' is to be replaced for v. 

If $v \in \{ e|_{A_1} \}$, i.e., if $[v] \in \{ e|_{A_1/E_1} \}$, then there is a choice of $g^1_\sigma$, the interpretation of $\sigma$ 
in $A_1/E_1$, such that 

\[ [v] = g^1_\sigma([v_1], \dots, [v_n]), \text{ where } [v_i] \in \{ e_i|_{A_1/E_1} \} \text{ for each } 1 \le i \le n. \]

By inductive hypothesis (2), for every $[v_i] \in \{ e_i|_{A_1/E_1} \}$, there is a $[v_i'] \in \{ e_i|_{A_2/E_2} \}$ such 
that $\Phi_{D_i}([v_i]) = [v_i']$. Because $\Phi$ is an isomorphism, there is a choice of $g^2_\sigma$ such that 

\[ \Phi_{D''}([v]) = [v'] = g^2_\sigma([v_1'], \dots, [v_n']), \text{ meaning that } v' \in \{ e|_{A_2} \}. \]

3. Inductive Step Assume for e having depth(e) < m, to show for e having 
depth(e) = m. (3) 

$e = \sigma(e_1, \dots, e_n)$, $\sigma \notin \Omega$. 

The proof goes the same way as for the basis step except that we use the models 
of the data type D' that has the operation $\sigma$. ■ 

For modified type algebras, we are interested in ground terms whose interpretations are not 
undefined. It can be shown for behaviorally equivalent type algebras $A_1$ and $A_2$ that if for 
some ground term e, $e|_{A_1}$ is undefined, then $e|_{A_2}$ is also undefined and vice versa. 

Thm. 2.6 For behaviorally equivalent $A_1$ and $A_2$, for any ground terms $e_1$ and $e_2$ of type 
$D''$, $\{ [e_1|_{A_1}] \} = \{ [e_2|_{A_1}] \} \Leftrightarrow \{ [e_1|_{A_2}] \} = \{ [e_2|_{A_2}] \}$. 

Proof From the above two theorems and the fact that $A_1/E_1$ and $A_2/E_2$ are isomorphically 
equivalent, the statement is immediate. ■ 



Appendix III - Proofs of Theorems in Chapter 4 



This appendix contains proofs of various theorems in Chapter 4. 

1. Specifications without Nondeterminism and without 
Exceptional Behavior 

Thm. 4.1 Every constructor ground term e of type Set-Int' is equivalent by equational 
reasoning to a ground term e' not having any occurrence of Remove, i.e., the equation 
'$e = e'$' $\in$ EQ(Set-Int'). 

Proof For every constructor ground term e of type Set-Int', there is a constructor ground 
term e' such that 

(*) '$e = e'$' $\in$ EQ(Set-Int') $\wedge$ #re(e') = 0, 
where #re(e) gives the number of occurrences of the operation symbol Remove in e. 
Similarly, the function #in gives the number of occurrences of the operation symbol Insert 
in a term. We show (*) by induction on #re(e). 

Basis #re(e) = 0. 
The above statement trivially holds, because e' is the same as e. 

Inductive Step Assume the statement holds for e such that #re(e) < k, 
show for #re(e) = k. 
Consider the outermost subterm $e_1$ in e such that $e_1$ = Remove($e_{11}$, i1). Clearly, 
#re($e_{11}$) < k, so there is a subterm $e'_{11}$ such that '$e_{11} = e'_{11}$' $\in$ EQ(Set-Int') and 
#re($e'_{11}$) = 0. Thus we have '$e_1$ = Remove($e'_{11}$, i1)' $\in$ EQ(Set-Int'). We show that (*) 
holds for Remove($e'_{11}$, i1) by induction on #in($e'_{11}$). 

Basis #in($e'_{11}$) = 0. 
'$e_1$ = Remove(Null, i1) 
= Null' $\in$ EQ(Set-Int') using Axiom 1. 

e' is obtained by substituting Null for $e_1$ in e. 



Inductive Step Assume the above holds for #in($e'_{11}$) < m, 
to show for $e'_{11}$ having m Insert's. 
$e'_{11}$ = Insert($e_{21}$, i2), so 
'$e_1$ = Remove(Insert($e_{21}$, i2), i1)' $\in$ EQ(Set-Int'). 

There are two cases. 

Case 1 i1 = i2 

'$e_1$ = Remove($e_{21}$, i1)' $\in$ EQ(Set-Int'). Axiom 2. 

By the inductive step, there is an $e'_{21}$ such that 
'Remove($e_{21}$, i1) = $e'_{21}$' $\in$ EQ(Set-Int') and #re($e'_{21}$) = 0. 
So, '$e_1 = e'_{21}$' $\in$ EQ(Set-Int'). 
We get e' by replacing $e_1$ by $e'_{21}$. 

Case 2 $\neg$(i1 = i2) 
'$e_1$ = Insert(Remove($e_{21}$, i1), i2)' $\in$ EQ(Set-Int'). Axiom 2. 

By the inductive step, there is an $e'_{21}$ such that 
'Remove($e_{21}$, i1) = $e'_{21}$' $\in$ EQ(Set-Int'), and thus '$e_1$ = Insert($e'_{21}$, i2)' $\in$ EQ(Set-Int'). 
We get e' by replacing $e_1$ by Insert($e'_{21}$, i2). ■ 
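
The proof above is constructive and can be read as a rewriting procedure. The Python sketch below assumes that the two axioms are Remove(Null, i) = Null and Remove(Insert(s, i2), i1) = Remove(s, i1) if i1 = i2, otherwise Insert(Remove(s, i1), i2), as they are used in the proof; the tuple encoding of terms and the function names are hypothetical.

```python
def normalize(t):
    """Rewrite a constructor ground term into an equivalent one without Remove."""
    if t == "Null":
        return t
    op, s, i = t
    s = normalize(s)                      # the inner term is now Remove-free
    if op == "Insert":
        return ("Insert", s, i)
    return remove(s, i)                   # op is "Remove"


def remove(s, i1):
    """Apply the two Remove axioms to a Remove-free term s."""
    if s == "Null":
        return "Null"                              # Axiom 1
    _, s2, i2 = s
    if i1 == i2:
        return remove(s2, i1)                      # Axiom 2, case i1 = i2
    return ("Insert", remove(s2, i1), i2)          # Axiom 2, case i1 != i2


def count(op, t):
    """Number of occurrences of the operation symbol op in t (#re or #in)."""
    return 0 if t == "Null" else (t[0] == op) + count(op, t[1])


term = ("Remove", ("Insert", ("Insert", "Null", 1), 2), 1)
result = normalize(term)
```

Here `result` is `("Insert", "Null", 2)`, which has no occurrence of Remove; the recursion in `remove` mirrors the inner induction on the number of Insert's.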

Thm. 4.4 If a specification S is sufficiently complete, then S is behaviorally complete. 

Proof If S is inconsistent, then since F(S) = $\emptyset$, S is trivially behaviorally complete. 

If S is consistent, we show that a sufficiently complete S is also behaviorally complete by 
contradiction. 

Suppose S is not behaviorally complete, so there exist two reduced algebras $A_1$ and $A_2$ in 
F(S) that are not isomorphically equivalent w.r.t. $\{ P_\sigma \mid \sigma \in \Omega \}$. Without any loss of 
generality, we can assume that $A_1$ and $A_2$ share the same domain corresponding to a 
defining type, so for each $D' \in \Delta$, $\Phi_{D'}$ is the identity function. Since every constructor is 
deterministic, there is a unique mapping $\Phi_D : V^1_D \to V^2_D$ which can possibly satisfy the 
following for every $\sigma$ in $\Omega$: 

for each set of values $v_1, \dots, v_n$ such that $P_\sigma[x_1/v_1, \dots, x_n/v_n]|_{A_1} = T$, 

\[ (*) \quad \Phi_{D'}(\{ f^1_\sigma(v_1, \dots, v_n) \}) = \{ f^2_\sigma(\Phi_{D_1}(v_1), \dots, \Phi_{D_n}(v_n)) \}. \]






If A_1 and A_2 are not isomorphically equivalent w.r.t. { P_σ | σ ∈ Ω }, this means that there
must exist an observer σ and a set of values v_1, ..., v_n such that P_σ[x_1/v_1, ..., x_n/v_n]|_A1
holds and (*) is not satisfied.

Using the minimality property, we can construct a legal ground term σ(e_1, ..., e_n) of
type D' ∈ Δ, where D' is the range of σ, and for each 1 ≤ i ≤ n, e_i is the ground term whose
interpretation is v_i in A_1. Since S is sufficiently complete, there exists a ground term e' of
type D' not having any operation symbol of D and auxiliary function used in S such that
'σ(e_1, ..., e_n) ≡ e'' ∈ EQ(S). This means that

because A_1 and A_2 are reduced algebras. This is in contradiction to (*) not being satisfied.
Hence the result. ∎

Thm. 4.6 For a consistent and sufficiently complete S, if any two legal ground terms e_1 and
e_2 of type D are distinguishable by S, then 'e_1 ≢ e_2' ∈ DS(S).

Proof: e_1 and e_2 are distinguishable by S means that for any A ∈ F(S), e_1|_A and e_2|_A are
distinguishable, i.e., there exists a term c(x) of type D' ∈ Δ with one free variable x of type
D such that c[x/v_1]|_A is distinguishable from c[x/v_2]|_A in A.
Using the above fact, we prove the theorem by induction on specifications.

Basis Specifications with no defining types. 
Case 1 Bool

'T ≢ F' ∈ DS(Bool). Every ground term of type Bool is equivalent to either T or F, so
the theorem holds.

Case 2 D other than Bool

All ground terms are observably equivalent, so the theorem holds.

Inductive Step Assume the above statement for the specification S' of a data type D' used in
the specification S of D. To show for S.
We can prove by contradiction that 'e_1 ≢ e_2' ∈ DS(S) as follows:
Assume e_1 ≡ e_2;

then c[x/e_1] ≡ c[x/e_2],






since S is sufficiently complete, there exist ground terms e'_1 and e'_2 of type D' such that
e'_1, e'_2 do not have any occurrence of an operation symbol of D, and 'e_1 ≡ e'_1' ∈ EQ(S) and
'e_2 ≡ e'_2' ∈ EQ(S), so we have 'e'_1 ≡ e'_2' ∈ EQ(S). Since e'_1, e'_2 are distinguishable by S', by
inductive hypothesis, 'e'_1 ≢ e'_2' ∈ DS(S'), so 'e'_1 ≢ e'_2' is also in DS(S). This is a
contradiction, as S is consistent. So, 'e_1 ≢ e_2' ∈ DS(S). ∎

2. Specifications with Exceptional Behavior and without 
Nondeterminism 

Thm. 4.9 Every legal constructor ground term e of type Stk-Int such that
'N?_Stk-Int(e) ≡ T' ∈ EQ(Stk-Int) is equivalent by equational reasoning to another legal
constructor ground term e' having only Null and Push, i.e., if 'N?_Stk-Int(e) ≡ T' ∈
EQ(Stk-Int), then 'e ≡ e'' ∈ EQ(Stk-Int).

Proof Proof is similar to that of Theorem 4.1 above.

Let #po and #rep be the functions on terms computing the number of occurrences of Pop
and Replace respectively. We show by induction on #po(e) + #rep(e) that

(*) if 'N?_Stk-Int(e) ≡ T' ∈ EQ(Stk-Int), then there exists an e' such that 'e ≡ e'' ∈
EQ(Stk-Int) and #po(e') + #rep(e') = 0.

Basis #po(e) + #rep(e) = 0.
e serves as e'.

Inductive Step Assume (*) above for the case #po(e) + #rep(e) < k,
to show for #po(e) + #rep(e) = k.
Consider the outermost subterm e_1 in e having Pop or Replace as the outermost
operation. It is obvious that if 'N?_Stk-Int(e) ≡ T' ∈ EQ(Stk-Int), then 'N?_Stk-Int(e_1) ≡ T'
∈ EQ(Stk-Int).

Case 1 e_1 = Pop(e_11)
Since 'N?_Stk-Int(e_11) ≡ T' ∈ EQ(Stk-Int), by inductive step, there exists an e'_11 such
that 'e_11 ≡ e'_11' ∈ EQ(Stk-Int) and #po(e'_11) + #rep(e'_11) = 0.

Since 'N?_Stk-Int(e_1) ≡ T' ∈ EQ(Stk-Int), e'_11 is not Null and so e'_11 = Push(e_21, i).
Thus 'e_1 ≡ Pop(Push(e_21, i)) ≡ e_21' ∈ EQ(Stk-Int). Axiom 1.






By replacing e_1 by e_21 in e, we get the required e'.

Case 2 e_1 = Replace(e_11, i1)
Since 'N?_Stk-Int(e_11) ≡ T' ∈ EQ(Stk-Int), by inductive step, there exists an e'_11 such
that 'e_11 ≡ e'_11' ∈ EQ(Stk-Int) and #po(e'_11) + #rep(e'_11) = 0.

Since 'N?_Stk-Int(e_1) ≡ T' ∈ EQ(Stk-Int), e'_11 is not Null and so e'_11 = Push(e_21, i2).

Thus e_1 ≡ Replace(Push(e_21, i2), i1)

≡ Push(Pop(Push(e_21, i2)), i1) Axiom 3

≡ Push(e_21, i1) Axiom 1

So 'e_1 ≡ Push(e_21, i1)' ∈ EQ(Stk-Int).

By replacing e_1 in e by Push(e_21, i1), we get the required e'. ∎
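
The two cases of this proof can be sketched as a normalization procedure on stack terms. The Python fragment below is an illustration under assumptions, not the specification itself: it uses only the two rewrite steps quoted in the proof (Pop(Push(s, i)) ≡ s as Axiom 1 and Replace(s, i) ≡ Push(Pop(s), i) as Axiom 3), and the exception NotNormal is a hypothetical stand-in for the case where N?_Stk-Int(e) ≡ T does not hold.

```python
from dataclasses import dataclass


class NotNormal(Exception):
    """Hypothetical marker: the term does not denote a normal value."""


@dataclass(frozen=True)
class Null:
    pass


@dataclass(frozen=True)
class Push:
    s: object
    i: int


@dataclass(frozen=True)
class Pop:
    s: object


@dataclass(frozen=True)
class Replace:
    s: object
    i: int


def normalize(e):
    """Return a term over Null and Push only that is equal to e."""
    if isinstance(e, Null):
        return e
    if isinstance(e, Push):
        return Push(normalize(e.s), e.i)
    inner = normalize(e.s)            # the term e'_11 of the proof
    if isinstance(inner, Null):
        # Pop or Replace applied to Null: outside the scope of the theorem.
        raise NotNormal(e)
    if isinstance(e, Pop):
        return inner.s                # Axiom 1
    return Push(inner.s, e.i)         # Axiom 3, then Axiom 1
```

Applied to Replace(Pop(Push(Push(Null, 1), 2)), 7), the sketch returns Push(Null, 7).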

Thm. 4.12 If a specification S is sufficiently complete, then S is behaviorally complete.

Proof If S is inconsistent, then F(S) = ∅, so S is trivially behaviorally complete.

If S is consistent, we show that a sufficiently complete S is behaviorally complete by
contradiction.

Suppose S is not behaviorally complete, so there exist two reduced algebras A_1
and A_2 in F(S) such that for every D' ∈ Δ, the domains corresponding to D' in A_1 and A_2 are
defined by isomorphically equivalent algebras in F(S'), where S' is a specification of D',
and A_2 is not partially isomorphically embeddable w.r.t. S in A_1. Without any loss of
generality, we can assume that A_1 and A_2 share the same domain corresponding to a
defining type, so for every D' ∈ Δ, Φ_D' is the identity function. Since every constructor is
deterministic, there are unique one-to-one partial functions Φ_D : V_D^2 → V_D^1 and
Φ_EXV : EXV^2 → EXV^1 which can possibly satisfy the requirements for A_2 to be partially
isomorphically embeddable in A_1 (see Def. 3.13 of isomorphic embeddability in
Section 3.5). The first two requirements there can be easily satisfied. The third
requirement is complex and is restated below:

For every operation σ ∈ Ω, for every set of values v_1, ..., v_n such that Φ_D(v_i) is defined
for each 1 ≤ i ≤ n, and P_σ[x_1/v_1, ..., x_n/v_n]|_A2 = T,

(a) if f_σ^A2 signals an exception value ex(v'_1, ..., v'_m) specified to be optional by S on the
input v_1, ..., v_n, then the associated condition O(x_1, ..., x_n) holds for v_1, ..., v_n, and
f_σ^A1(Φ_D(v_1), ..., Φ_D(v_n)) either signals ex(Φ_D'(v'_1), ..., Φ_D'(v'_m)) or returns Φ_D'(v') for
some v', or

(b) if Φ_D'(v'_1), ..., Φ_D'(v'_m) are defined and f_σ^A1 signals an exception value
ex(Φ_D'(v'_1), ..., Φ_D'(v'_m)) specified to be optional by S on the input Φ_D(v_1), ..., Φ_D(v_n),
then the associated condition O(x_1, ..., x_n) holds for Φ_D(v_1), ..., Φ_D(v_n), and
f_σ^A2(v_1, ..., v_n) either signals ex(v'_1, ..., v'_m) or returns v'; otherwise,

(c) Φ_D'(f_σ^A2(v_1, ..., v_n)) = f_σ^A1(Φ_D(v_1), ..., Φ_D(v_n)) (*).

For A_2 not to be partially isomorphically embeddable in A_1, at least one of the
above conditions is not satisfied. Suppose the condition (a) is not satisfied; we have
f_σ^A1(Φ_D(v_1), ..., Φ_D(v_n)) ≠ ex(Φ_D'(v'_1), ..., Φ_D'(v'_m)),
meaning that A_2 does not satisfy the optional exception condition for σ in S, which is
contradictory to the assumption that A_2 ∈ F(S). So, the condition (a) could not have been
violated. Similarly, it can be shown that the condition (b) could not have been violated.

The violation of condition (c) is then the only possibility. In that case, for some
σ ∈ Ω,

(i) exactly one of the two sides of the equation (*) signals an exception,

(ii) different sides signal different exceptions, or

(iii) different sides return different values.
Using the minimality property, we can construct a legal ground term e = σ(e_1, ..., e_n) of type D',
where for each 1 ≤ i ≤ n, e_i is the ground term whose interpretation is v_i in A_1. The
possibilities (i) and (ii) above are ruled out because of the following reasons:

For both (i) and (ii), the exception signalled by either side must be different from the
optional exception. Since S is sufficiently complete, either 'N?_D'(e) ≡ T' ∈ EQ(S), or
'N?_D'(e) ≡ F' ∈ EQ(S). If 'N?_D'(e) ≡ T' ∈ EQ(S), then none of e|_A1 and e|_A2 can be an
exception value, ruling out (i) and (ii). If 'N?_D'(e) ≡ F' ∈ EQ(S), then 'e signals ext' ∈
EQ(S) for some ext, meaning that
e|_A1 = e|_A2 = ext,
again ruling out (i) and (ii).

The only possibility is (iii). Then e must be of type D' ∈ Δ, as if e is of type D, then
the definition of Φ_D ensures that the equation (*) is satisfied. We have either 'N?_D'(e) ≡ T'
∈ EQ(S) or neither 'N?_D'(e) ≡ T' ∈ EQ(S) nor 'N?_D'(e) ≡ F' ∈ EQ(S). If 'N?_D'(e) ≡ T' ∈
EQ(S), then there is a ground term e' without any operation symbol of D and auxiliary
functions used in S such that 'e ≡ e'' ∈ EQ(S); so e|_A1 = e'|_A1 = e'|_A2 = e|_A2, ruling out (iii). If
neither 'N?_D'(e) ≡ T' ∈ EQ(S) nor 'N?_D'(e) ≡ F' ∈ EQ(S), then also there exists a ground
term e' without any operation symbol of D and auxiliary functions used in S such that
'e ≡ e'' ∈ EQ(S ∪ { N?_D'(e) ≡ T }), which again rules out (iii) because of the reasons
similar to the ones discussed above.

The above thus implies that A_2 is partially isomorphically embeddable in A_1.

Hence the result. ∎



Thm. 4.13 For a consistent and sufficiently complete S, if any two legal ground terms e_1
and e_2 of type D are distinguishable by S, then 'e_1 ≢ e_2' ∈ DS(S).

Proof: e_1 and e_2 are distinguishable by S means that for any A ∈ F(S), e_1|_A and e_2|_A are
distinguishable, i.e.,

(a) e_1|_A is an exception value and e_2|_A is a normal value,

(b) e_1|_A and e_2|_A are distinguishable exception values, or

(c) e_1|_A and e_2|_A are normal values and there exists a term c(x) of type D' ∈ Δ ∪ { D }
with one free variable x of type D such that c[x/v_1]|_A is distinguishable from c[x/v_2]|_A in
A.

Since S is sufficiently complete, it can be shown that

(i) if a ground term e interprets to an exception value in every algebra A ∈ F(S), then
'N?_D(e) ≡ F' ∈ EQ(S), and also

(ii) if e interprets to a normal value in every algebra A ∈ F(S), then 'N?_D(e) ≡ T' ∈
EQ(S).






Using the above facts, we prove the theorem by induction on specifications. 

Basis Specifications with no defining types. 
Case 1 Bool

'T ≢ F' ∈ DS(Bool). Every ground term of type Bool is equivalent to either T or F, so
the theorem holds.
Case 2 D other than Bool
Subcase 1 S does not specify any operation to signal.

All ground terms are observably equivalent, so the theorem holds.
Subcase 2 S specifies operations to signal.
Assume e_1 and e_2 are distinguishable by S, so there is one of the above three
possibilities. We show in each case how 'e_1 ≢ e_2' can be derived in DS(S).

(a) Since S is sufficiently complete, 'N?_D(e_1) ≡ F' ∈ EQ(S) and 'N?_D(e_2) ≡ T' ∈
EQ(S), and by the axiom (vii) in Subsection 4.3.3, 'e_1 ≢ e_2' ∈ DS(S).

(b) By sufficient completeness of S, using the axiom (vi) in Subsection 4.3.3 and
repeatedly using the argument in case 2, we get 'e_1 ≢ e_2' ∈ DS(S).

(c) By the substitution property of the operations, and the sufficient
completeness of S, we get 'e_1 ≢ e_2' ∈ DS(S), by the method of proof by contradiction.

Inductive Step Assume the above statement for the specification S' of a data type D' used 
in the specification S of D. To show for S. 

Assume e_1 and e_2 are distinguishable by S. For the possibilities (a) and (b), the argument
used in the basis step applies. For the third possibility, in addition to the case considered in
the basis step, we have the case when the interpretations of e_1 and e_2 are distinguishable in
A because of a computation c(x) returning distinguishable results of type D' ∈ Δ. For this
case also, we can prove by contradiction that 'e_1 ≢ e_2' ∈ DS(S) as follows:

Assume e_1 ≡ e_2;

then c[x/e_1] ≡ c[x/e_2]. (*)

We have three subcases:

Subcase 1 Both sides of (*) interpret to a normal value in A.

Since S is sufficiently complete, there exist ground terms e'_1 and e'_2 of type D'
such that e'_1, e'_2 do not have any occurrence of an operation symbol of D, and 'e_1 ≡ e'_1',
'e_2 ≡ e'_2' ∈ EQ(S), so we have 'e'_1 ≡ e'_2' ∈ EQ(S). Since e'_1, e'_2 are distinguishable by S', by
inductive hypothesis, 'e'_1 ≢ e'_2' ∈ DS(S'), so 'e'_1 ≢ e'_2' is also in DS(S). This is a
contradiction, as S is consistent. So, 'e_1 ≢ e_2' ∈ DS(S).

Subcase 2 One of the two sides of (*) interprets to a normal value.

Without any loss of generality, assume the l.h.s. interprets to a normal value. By
sufficient completeness of S, there is an e'_1 such that 'e_1 ≡ e'_1' ∈ EQ(S), and there is an
exception ground term ext such that 'e_2 signals ext' ∈ EQ(S), so again, we have using the
axioms, 'e_1 ≢ e_2' ∈ DS(S).

Subcase 3 Both sides of (*) interpret to distinguishable exception values.

Using the sufficient completeness of S, we can show using a similar argument that
'e_1 ≢ e_2' ∈ DS(S).
Hence the theorem. ∎

3. Specifications with Exceptional Behavior and
Nondeterminism

Thm. 4.14 f and TR(f) are semantically equivalent.

Proof By induction on the structure of f. We only need to show the basis step; the inductive
step is straightforward because the symbols ~, ∨, and ∀ have the same interpretation. So,
we have f as 'e_1 ≡ e_2'. Consider an extended type algebra A of D in which f and TR(f) can
be interpreted (i.e., A has an interpretation for every nondeterministic operation symbol σ
and the corresponding auxiliary function symbol σ_p such that the interpretation of the
auxiliary function is the relation computed by the interpretation of the nondeterministic
operation symbol).

Case (a) f does not have any occurrence of a nondeterministic operation symbol.
TR(f) = f, so the statement trivially holds.

Case (b) Both e_1 and e_2 have occurrences of nondeterministic symbols.

It is obvious from the description of the procedure TR in Subsection 4.4.1 that the
interpretation of 'e_1 ≡ e_2' is equivalent to the interpretation of TR(f).

Case (c) Exactly one of e_1 and e_2 has occurrences of nondeterministic symbols. Again from
the description of TR in Subsection 4.4.1, the interpretation of 'e_1 ≡ e_2' is equivalent to the
interpretation of TR(f). ∎






Appendix IV - Specifications of Data Types used in Chapter 5 

In this appendix, we give specifications of the data types Null,
Struct [n_1 : D_1, ..., n_k : D_k], Oneof [n_1 : D_1, ..., n_k : D_k], and Sequence-Int used in Chapter 5.
Struct and Oneof are type schemas. Below, we specify an instance of these schemas
assuming fixed but unspecified parameters, i.e., k as well as D_1, ..., D_k are fixed. Since the
specification is given for an arbitrary k, we have used the '...' notation. The specification of
any particular instance, such as Oneof [empty: Null, pair: Pair] or
Struct [car: Int, cdr: List-Int] used in Chapter 5, can be given without using the '...'
notation.



Figure A4.1. Specification of Null

Operations

Nil   : → Null
Equal : Null × Null → Bool    as x1 = x2

Axioms

Nil = Nil ≡ T




Figure A4.2. Specification of Struct [n_1 : D_1, ..., n_k : D_k]

Struct [n_1 : D_1, ..., n_k : D_k] as D

Operations

Create      : D_1 × ... × D_k → D
Fetch_n_1   : D → D_1
...
Fetch_n_k   : D → D_k
Replace_n_1 : D × D_1 → D
...
Replace_n_k : D × D_k → D
Equal       : D × D → Bool    as x1 = x2

Axioms

Fetch_n_1(Create(x1, ..., xk)) ≡ x1
...
Fetch_n_k(Create(x1, ..., xk)) ≡ xk
Replace_n_1(Create(x1, ..., xk), y1) ≡ Create(y1, ..., xk)
...
Replace_n_k(Create(x1, ..., xk), yk) ≡ Create(x1, ..., yk)
Create(x1, ..., xk) = Create(y1, ..., yk) ≡ (x1 = y1) ∧ ... ∧ (xk = yk)




Figure A4.3. Specification of Oneof [n_1 : D_1, ..., n_k : D_k]

Oneof [n_1 : D_1, ..., n_k : D_k] as D

Operations

Make_n_1  : D_1 → D
...
Make_n_k  : D_k → D
Value_n_1 : D → D_1
              → wrong-tag
...
Value_n_k : D → D_k
              → wrong-tag
Is_n_1    : D → Bool
...
Is_n_k    : D → Bool
Equal     : D × D → Bool    as x1 = x2

Restrictions

~ Is_n_1(x) ⇒ Value_n_1(x) signals wrong-tag
...
~ Is_n_k(x) ⇒ Value_n_k(x) signals wrong-tag

Axioms

Value_n_1(Make_n_1(x1)) ≡ x1
...
Value_n_k(Make_n_k(xk)) ≡ xk
Is_n_1(Make_n_1(x1)) ≡ T
...
Is_n_1(Make_n_k(xk)) ≡ F
...
Is_n_k(Make_n_1(x1)) ≡ F
...
Is_n_k(Make_n_k(xk)) ≡ T
Make_n_1(x1) = Make_n_1(y1) ≡ x1 = y1
...
Make_n_1(x1) = Make_n_k(yk) ≡ F
...
Make_n_k(xk) = Make_n_1(y1) ≡ F
...
Make_n_k(xk) = Make_n_k(yk) ≡ xk = yk
x = y ≡ y = x
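
As an informal reading of Figure A4.3, the following Python sketch models a Oneof value as a tag together with a payload. It is an assumption-laden illustration: the single parameterized functions make, is_tag, value_of and equal stand in for the families Make_n_i, Is_n_i, Value_n_i and Equal, and the Python exception WrongTag stands in for the signalled exception wrong-tag.

```python
from dataclasses import dataclass


class WrongTag(Exception):
    """Stands in for the wrong-tag exception of Figure A4.3."""


@dataclass(frozen=True)
class Oneof:
    tag: str
    payload: object


def make(tag, x):
    """Make_n_i(x), with the tag name n_i passed as a string."""
    return Oneof(tag, x)


def is_tag(tag, d):
    """Is_n_i(d): T exactly when d was built by Make_n_i."""
    return d.tag == tag


def value_of(tag, d):
    """Value_n_i(d); signals wrong-tag when ~ Is_n_i(d)."""
    if not is_tag(tag, d):
        raise WrongTag(tag)
    return d.payload


def equal(x, y):
    """Equal: values with different tags are unequal; same tag compares payloads."""
    return x.tag == y.tag and x.payload == y.payload
```

With an instance such as Oneof [empty: Null, pair: Pair], value_of("pair", make("pair", p)) returns p, while value_of("empty", make("pair", p)) raises WrongTag, mirroring the restriction on Value_n_i.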






Figure A4.4. Specification of Sequence-Int

Sequence-Int as SI

Operations

New     : → SI
Addl    : SI × Int → SI
Addh    : SI × Int → SI
Concat  : SI × SI → SI              as x1 • x2
Subseq  : SI × Int × Int → SI
                         → bounds
                         → negative-size
Fill    : Int × Int → SI
                    → negative-size
Fetch   : SI × Int → Int            as x[i]
                   → bounds
Bottom  : SI → Int
             → bounds
Top     : SI → Int
             → bounds
Reml    : SI → SI
             → bounds
Remh    : SI → SI
             → bounds
Size    : SI → Int
Empty   : SI → Bool
Replace : SI × Int × Int → SI
                         → bounds
Index   : SI × Int → Int
                   → element-not-in
Member  : SI × Int → Bool
Equal   : SI × SI → Bool            as x1 = x2


Restrictions

(i1 < 1 ∨ i1 > (Size(s) + 1)) ⇒ Subseq(s, i1, i2) signals bounds
(~ (i1 < 1 ∨ i1 > (Size(s) + 1)) ∧ (i2 < 0)) ⇒ Subseq(s, i1, i2) signals negative-size
i < 0 ⇒ Fill(i, j) signals negative-size
(i < 1 ∨ i > Size(s)) ⇒ Fetch(s, i) signals bounds
Size(s) = 0 ⇒ Bottom(s) signals bounds
Size(s) = 0 ⇒ Top(s) signals bounds
Size(s) = 0 ⇒ Reml(s) signals bounds
Size(s) = 0 ⇒ Remh(s) signals bounds
(i < 1 ∨ i > Size(s)) ⇒ Replace(s, i, j) signals bounds
~ Member(s, j) ⇒ Index(s, j) signals element-not-in

Axioms

Addl(New, j) ≡ Addh(New, j)
Addl(Addh(s, j1), j2) ≡ Addh(Addl(s, j2), j1)
s • New ≡ s
s1 • Addh(s2, j) ≡ Addh(s1 • s2, j)
Subseq(s, i1, 0) ≡ New
Subseq(Addh(s, j), i1, i2 + 1) ≡ if (i1 + i2) < (Size(s) + 1) then Subseq(s, i1, i2 + 1)
    else if (i1 + i2) = (Size(s) + 1) then Addh(Subseq(s, i1, i2), j)
    else Subseq(Addh(s, j), i1, Size(s) - i1 + 2)
Fill(0, j) ≡ New
Fill(i+1, j) ≡ Addh(Fill(i, j), j)
Fetch(Addh(s, j), i) ≡ if i = Size(s) + 1 then j else Fetch(s, i)
Bottom(s) ≡ Fetch(s, 1)
Top(s) ≡ Fetch(s, Size(s))
Reml(s) ≡ Subseq(s, 2, Size(s)-1)
Remh(s) ≡ Subseq(s, 1, Size(s)-1)
Size(New) ≡ 0
Size(Addh(s, j)) ≡ Size(s) + 1
Empty(New) ≡ T
Empty(Addh(s, j)) ≡ F
Member(New, j) ≡ F
Member(Addh(s, j1), j2) ≡ if j1 = j2 then T else Member(s, j2)
Replace(Addh(s, j1), i, j2) ≡ if i = Size(s) + 1 then Addh(s, j2) else Addh(Replace(s, i, j2), j1)
Fetch(s, Index(s, j)) ≡ j
x = x ≡ T
x = y ≡ y = x
New = Addh(s, j) ≡ F
Addh(s1, j1) = Addh(s2, j2) ≡ (j1 = j2) ∧ (s1 = s2)
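
Several of the axioms above are recursive definitions over the constructors New and Addh, so they can be executed almost literally. The Python fragment below is a sketch of that reading for Size, Addl, Concat, Fetch, Member and Fill only. The representation of SI values as nested New/Addh terms and the Python exceptions Bounds and NegativeSize are assumptions made for illustration.

```python
from dataclasses import dataclass


class Bounds(Exception):
    """Stands in for the 'bounds' exception of Figure A4.4."""


class NegativeSize(Exception):
    """Stands in for the 'negative-size' exception of Figure A4.4."""


@dataclass(frozen=True)
class New:
    pass


@dataclass(frozen=True)
class Addh:
    s: object
    j: int


def size(s):
    # Size(New) = 0; Size(Addh(s, j)) = Size(s) + 1
    return 0 if isinstance(s, New) else size(s.s) + 1


def addl(s, j):
    # Addl(New, j) = Addh(New, j); Addl(Addh(s, j1), j2) = Addh(Addl(s, j2), j1)
    if isinstance(s, New):
        return Addh(New(), j)
    return Addh(addl(s.s, j), s.j)


def concat(s1, s2):
    # s . New = s; s1 . Addh(s2, j) = Addh(s1 . s2, j)
    if isinstance(s2, New):
        return s1
    return Addh(concat(s1, s2.s), s2.j)


def fetch(s, i):
    # Restriction: (i < 1 or i > Size(s)) => Fetch(s, i) signals bounds
    if i < 1 or i > size(s):
        raise Bounds(i)
    return s.j if i == size(s.s) + 1 else fetch(s.s, i)


def member(s, j):
    # Member(New, j) = F; Member(Addh(s, j1), j2) = if j1 = j2 then T else Member(s, j2)
    if isinstance(s, New):
        return False
    return True if s.j == j else member(s.s, j)


def fill(i, j):
    # Restriction: i < 0 => Fill(i, j) signals negative-size
    if i < 0:
        raise NegativeSize(i)
    return New() if i == 0 else Addh(fill(i - 1, j), j)
```

Under this reading, addl(Addh(Addh(New, 2), 3), 1) evaluates to Addh(Addh(Addh(New, 1), 2), 3), so fetching index 1 gives 1 and fetching index 3 gives 3.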



